(54) Title: IMPROVED IMAGE SEGMENTATION PROCESSING BY USER-GUIDED IMAGE PROCESSING TECHNIQUES

(57) Abstract: Machine analysis of an image to segment the image for subsequent processing is guided by input from a human 
operator. A convenient drawing tool allows the user to select an object in an image. Machine image segmentation processing is 
constrained to a region of interest indicated by the operator using the drawing tool. Through the combination of operator input and
machine analysis, including edge detection, the object's boundaries are detected accurately. A resulting key signal may be manipulated
by the operator in a number of respects and output for use in subsequent processing of the image, including color correction.
Automatic segmentation is also applied to each image in a video clip based on a region of interest indicated by the operator on a first
image of the clip. The region of interest is repositioned from image to image on the basis of detected motion of the object indicated
by the region of interest.
IMPROVED IMAGE SEGMENTATION PROCESSING
BY USER-GUIDED IMAGE PROCESSING TECHNIQUES 

BACKGROUND OF THE INVENTION 

This invention relates to systems and methods for processing image signals. 
More particularly, the present invention pertains to improved systems and methods for
segmenting images to generate key signals and for other purposes, and further is 
concerned with manipulation of key signals. 

It is frequently desirable to segment an image plane so that only a portion of 
an image displayed in the image plane is selected for processing. For example, image
segmentation is employed to select a particular object in an image for color correction
separate from the balance of the image. According to one conventional image 
segmentation technique, the human operator attempts to fit a rectangular or circular 
window to an object to be color-corrected. Since most objects to be selected are 
neither circular nor rectangular, the fitting of the window to the object is usually
inexact to a considerable extent. Even if the edges of the window are blurred or
softened, the resulting color correction applied to the area of the window often 
produces unsatisfactory results. 

In another known technique, referred to as "rotoscoping," the operator draws 
the boundary of a window under high magnification at a pixel-by-pixel level to
outline the boundary of an object to be selected for color-correction. The rotoscope
technique can result in windows that are very precisely matched to the object's 
outline, thereby producing high quality results. However, this technique is very 
time-consuming and labor-intensive, and therefore costly. 

Other image segmentation techniques rely on color keying. According to one
technique, the human operator draws a free-hand closed line-figure that entirely
surrounds an object to be selected, and then draws a second free-hand closed 
line-figure that is entirely within the object. The computer then examines the colors 
of the pixels between the two line-figures to determine whether the pixels match those 
of the interior of the object or those of the background. This technique is unlikely to
produce satisfactory results unless the object to be selected is of a contrasting color
relative to the background.

Various other proposals have also been made for semi-automatic image
segmentation processing in which a human operator provides some guidance to an 
object boundary finding algorithm to be carried out by a computer processor. 
Examples of these proposals are disclosed in the following U.S. patents:

No. 5,247,583, issued to Kato et al. and entitled, "Image Segmentation
Method and Apparatus Therefor;"

No. 5,617,487, issued to Yoneyama et al. and entitled, "Image Cutout
Apparatus;"

No. 5,181,261, issued to Nagao and entitled, "An Image Processing Apparatus
For Detecting the Boundary of an Object Displayed in Digital Image;"

No. 5,887,082, issued to Mitsunaga et al. and entitled, "Image Detecting 
Apparatus." 

However, to the best of applicants' knowledge none of these prior proposals 
have been embodied in a commercially available segmentation apparatus that can
reliably identify object boundaries in a wide range of circumstances, and even when
the object of interest shares color characteristics with the background of the image. It 
appears that prior proposals have failed to find an optimal combination of 
sophisticated image processing techniques and flexible options for the operator to 
guide the image processing techniques. Moreover, it is believed that the prior art has
to date focused on still image segmentation, and has failed to consider how
human-guided computerized image segmentation can be applied to dynamic 
sequences of images. 

It would be desirable to provide an image segmentation technique in which an 
object to be processed, whether in a still image or a dynamic sequence of images, can
be accurately and reliably identified and its boundaries outlined, without requiring
laborious detailed input from a human operator. 

OBJECTS OF THE INVENTION 

Accordingly, an object of the invention is to satisfy the above needs and to 
provide a system and method for segmenting images with increased accuracy, 
efficiency, speed and convenience.

A further object is to efficiently apply an image segmentation algorithm to a 
dynamic sequence of images with limited operator guidance.

Another object of the invention is to provide an apparatus and method which
quickly and accurately generate a key signal to isolate a desired object for color 
correction or other image processing. 

An additional object of the invention is to provide an improved user interface 
for generating matte and key signals.

Further objects of the invention are concerned with providing improved 
techniques for manipulating and adjusting key signals. 

SUMMARY OF THE INVENTION 

The invention satisfies the needs identified above and meets the foregoing 
objects by providing a method in which flexible tools for guidance by a human
operator are combined with sophisticated machine analysis techniques to produce 
better and more accurate object selection windows than have heretofore been 
practical. 

In a method provided in accordance with a first aspect of the invention, a first
image of a sequence of images is displayed on a display device, and a region of
interest is designated by the operator. An image segmentation algorithm is applied to 
the first image to generate an outline in the region of interest, the image segmentation 
algorithm being constrained to operate only within the region of interest. Another 
algorithm provides an indication of the motion of an object corresponding to the
outline between the first image and a second image of the sequence of images and the
region of interest is repositioned on the basis of the indicated motion of the object. 
The image segmentation algorithm is then applied to the second image to generate a 
second outline in the repositioned region of interest. 
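
The repositioning step of this first aspect can be pictured with a minimal sketch. It assumes, purely for illustration, that the region of interest is held as a set of (x, y) pixel coordinates and that the motion indication is a single integer displacement (dx, dy); the name `reposition_roi` and this representation are hypothetical and are not taken from the specification.

```python
def reposition_roi(roi, motion, width, height):
    """Shift every ROI pixel by the indicated object motion.

    Pixels that would leave the image plane are dropped (an assumption).
    """
    dx, dy = motion
    shifted = set()
    for x, y in roi:
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            shifted.add((nx, ny))
    return shifted


# ROI drawn by the operator on the first image of the clip.
roi_first = {(2, 2), (3, 2), (4, 2)}
# ROI carried over to the second image after the object moved by (1, 2).
roi_second = reposition_roi(roi_first, motion=(1, 2), width=6, height=6)
```

The segmentation algorithm would then be run on the second image, constrained to `roi_second`, in the same way it was run on the first image.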

According to another aspect of the invention, a method of segmenting an
image plane on the basis of features of an image displayed in the image plane includes
the following steps: displaying the image on a display device, using a drawing device 
to superimpose a free-hand drawing figure on the image displayed on the display 
device (the free-hand drawing figure defining a band-shaped region of interest in the 
image plane formed as the locus of a circle moved in an arbitrary manner), applying
an image analysis algorithm to the displayed image (the image analysis algorithm
being constrained to operate only within the region of interest defined by the 
free-hand drawing figure and the algorithm operating without reference to any portion 
of the image outside of the region of interest), and segmenting the image plane on the
basis of a result provided by application of the image analysis algorithm. 

According to yet another aspect of the invention, a process for extracting 
features from an image includes applying an edge detector algorithm to pixel 
information arrayed in a region of interest in an image plane. The edge detector
algorithm generates edge information from the pixel information. The process further 
includes the step of applying a bias function to the edge information to emphasize 
components of the edge information at the central portions of the region of interest, 
thereby producing biased edge information. 
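
One way to picture such a bias function is sketched below. The linear (triangular) weighting, the function name `apply_center_bias`, and the per-pixel distance input are all assumptions made for illustration; this passage does not specify the form of the bias function.

```python
def apply_center_bias(edge_strength, dist_from_center, half_width):
    """Weight each edge value by how close its pixel lies to the ROI centerline."""
    biased = []
    for e, d in zip(edge_strength, dist_from_center):
        # Assumed weighting: 1 at the center of the band, falling to 0 at its border.
        weight = max(0.0, 1.0 - d / half_width)
        biased.append(e * weight)
    return biased


# Three pixels with equal raw edge responses at the border, at the center,
# and halfway between, in a band whose half-width is 4 pixels.
raw = [0.8, 0.8, 0.8]
dist = [4.0, 0.0, 2.0]
biased = apply_center_bias(raw, dist, half_width=4.0)
```

With this weighting, an edge response near the middle of the operator's stroke outweighs an equally strong response near the stroke's border.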

According to a further aspect of the invention, an edge-modulated softness
function is provided with respect to a key signal. In accordance with this aspect of the 
invention, a key boundary is generated by means of an edge detection algorithm, and 
the algorithm generates for each pixel on the key boundary edge-degree data which 
indicates a degree of definiteness of an edge at the respective pixel. A softness
function is adjusted along the key boundary in dependence on the edge-degree data.
The degree of softness is increased at points on the key boundary where a less definite 
edge was found. 
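
The relationship between edge-degree data and softness can be sketched as follows. The linear mapping, the 0-to-1 range assumed for the edge-degree data, and the names `softness_along_boundary`, `min_soft` and `max_soft` are illustrative assumptions only.

```python
def softness_along_boundary(edge_degree, min_soft=1.0, max_soft=8.0):
    """Map each boundary pixel's edge-degree (assumed 0..1) to a softness width in pixels."""
    widths = []
    for e in edge_degree:
        e = min(1.0, max(0.0, e))  # clamp to the assumed range
        # A definite edge (e near 1) gets little softness; a weak edge gets more.
        widths.append(max_soft - e * (max_soft - min_soft))
    return widths


# A sharp edge, a moderate edge, and a point where almost no edge was found.
widths = softness_along_boundary([1.0, 0.5, 0.0])
```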

According to still another aspect of the invention, a softness function is 
adjusted on the basis of an operator input signal to provide a "clean up" function. In
implementing the clean up function, a key boundary is generated, a first region
bordered by the key boundary is designated to be an inside region and a second region 
bordered by the key boundary is designated to be an outside region. A softness 
function is applied to the key boundary to generate a gradient in a key signal between 
the inside region and the outside region. In response to a control signal input by a
human operator, the softness function is adjusted so that the slope of the gradient is
increased on a side adjacent to the outside region without changing the slope of the 
gradient on a side adjacent the inside region. 
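
The asymmetric adjustment can be illustrated with a minimal sketch. It assumes a linear ramp in the key signal as a function of signed distance from the key boundary, with the boundary itself at half level; the function name `key_value` and the `cleanup` factor are hypothetical stand-ins for the operator's control signal.

```python
def key_value(d, softness, cleanup=1.0):
    """Key level (1 = fully inside, 0 = fully outside) at signed distance d.

    d < 0 lies toward the inside region, d > 0 toward the outside region.
    """
    slope = 0.5 / softness
    if d > 0:
        slope *= cleanup  # steepen only the half of the ramp adjacent the outside region
    return min(1.0, max(0.0, 0.5 - slope * d))


distances = (-4, -2, 0, 2, 4)
plain = [key_value(d, softness=4.0) for d in distances]
cleaned = [key_value(d, softness=4.0, cleanup=2.0) for d in distances]
```

In this sketch the values for d <= 0 are identical in `plain` and `cleaned`, while the ramp on the outside reaches zero twice as quickly when `cleanup=2.0`.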

The features of the invention allow for highly efficient image segmentation, in 
which a desired object in a dynamic stream of images may be selected for subsequent
processing with great accuracy. The results obtained rival those which could be
achieved in the prior art only by use of a rotoscope, but without the tedious and 
extremely time consuming high-magnification work required by the rotoscope. As 
compared to the rotoscope, the present invention represents an orders-of-magnitude
improvement in speed and convenience. 

Other significant features of the invention include a user interface that is 
highly intuitive, and easy to learn and to use. In addition, unique key-signal 
manipulation tools are provided which further enhance the utility of the invention.

The techniques of the present invention may advantageously be embodied in 
an external matte/key generator to be provided as a peripheral device for a color 
correction apparatus. The techniques of the present invention are also applicable to 
many other functions, such as image compositing, editing of still images generally, 
desktop publishing applications, video and motion picture production, colorizing of
black and white films, and 3-D graphics displays. 

It is also contemplated to include at least some of the capabilities of the 
present invention in image processing software of the types distributed to consumers 
and professional artists for operation on standard personal computers.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects, features, and advantages of the present invention 
will become apparent upon consideration of the following detailed description of 
illustrative embodiments thereof, especially when taken in conjunction with the 
accompanying drawings, wherein:

Fig. 1 is a block diagram of an image processing system in which the present
invention is applied. 

Fig. 2 is a block diagram of personal computer hardware which may constitute 
a portion of an image segmentation component shown in Fig. 1.

Figs. 3 and 4 together schematically illustrate image segmentation and key 
signal manipulation processes carried on in accordance with the invention.

Figs. 5A and 5B pictorially illustrate key signal manipulation processes 
carried out in accordance with the invention.

Fig. 6 is a screen display which shows an image to be processed for image 
segmentation as well as certain control options made available to a human operator.

Fig. 7 is a screen display similar to Fig. 6, but also showing a partial drawing
figure superimposed on the image to select a portion of the image for color correction.

Fig. 8 is another screen display, showing the complete drawing figure which
selects an image portion for color correction. 

Fig. 9 is another screen display, showing a region of interest designated by the 
human operator for image segmentation purposes, as well as associated inside and 
outside regions and an extended region of interest.

Fig. 10 is a pictorial illustration of certain calculations included in processes 
shown in Fig. 3. 

Fig. 11 is another screen display, illustrating the locus of a center of the region
of interest. 

Fig. 12 is another screen display, showing a mapping of edge detection
information generated by reference to luminance image information. 

Fig. 13 is a screen display similar to Fig. 12 but showing a mapping of edge 
detection information based on color image information. 

Fig. 14 is still another similar screen display, showing a combination of the 
luminance and color edge detection maps.

Fig. 15 is a pictorial illustration of a step included in the processes illustrated 
in Fig. 3. 

Fig. 16 is a screen display similar to Figs. 12 - 14, and illustrating a result of 
applying a biasing function to the combined edge information map shown in Fig. 14.

Fig. 17 is another screen display, illustrating edge gradient data calculated
from the biased edge data illustrated in Fig. 16. 

Fig. 18 is a screen display illustrating the effect of applying a diffusion 
function to the edge gradient data illustrated in Fig. 17. 

Fig. 19 is still another screen display, showing the image that was processed, 
together with an outline which is the outcome of the image segmentation process of
the present invention. 

Fig. 20 is a screen display which shows a key mask produced from the image 
segmentation process.

Fig. 21 is another screen display, illustrating an outline adjustment mode 
provided in accordance with the invention.

Fig. 22 is a screen display which is similar to Fig. 20, but showing a key mask 
to which a softness function has been applied.

Fig. 23 is another screen display, showing how the key mask of Fig. 22 selects
a portion of the image. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

System Overview 

Fig. 1 shows an image processing system 100 in which the present invention is
employed. The image processing system 100 includes an image information source 
device 102 which provides information representative of images to be processed. In 
the particular embodiment shown in Fig. 1, the image processing system 100 is 
employed for color correction and includes a color-correction device 104 which 
receives image information representing images to be color-corrected from the source
device 102. 

The image processing system 100 also includes an image segmentation device 
106 which receives image information from the source device 102. The image 
segmentation device 106 processes the image information to generate key signals
which are output from the image segmentation device 106 to the color correction
device 104. The color correction device 104 uses the key signals generated by the 
image segmentation device 106 to control color correction processes in the color 
correction device 104. 

The source device 102 may be any conventional memory or mass storage
device used to store digital image information. The source device 102 may also, or
alternatively, include conventional film and television record and playback devices 
including telecine transfer systems and film projectors, and video tape and disc 
players and recorders. If such devices are employed, there preferably is a mechanism 
for synchronizing the segmentation device 106 and the source 102 so that the key
signal output from segmentation device 106 is provided to the color corrector 104
synchronously with the corresponding image from source 102. Alternatively, a digital 
camera or a transmission facility may be substituted for the image information source 
device 102 as the source of the image information fed to the color correction device 
104 and to the image segmentation device 106. 

The color correction device 104 may also be a conventional item, and
preferably is either one of the ColorVision Stealth and ColorVision Copernicus color
correctors, which are available from the assignee of the present application.

Hardware Aspects of Image Segmentation Device

The image segmentation device 106 is preferably implemented with standard 
PC hardware programmed with software provided in accordance with the invention. 
It is also preferred that the image segmentation device 106 have enhanced digital 
image storage capabilities by incorporating an integrated digital disk recorder board
such as the ClipStation PRO, which is commercially available from DVS GmbH, 
Hannover, Germany. 

Fig. 2 provides a simplified overview of the hardware which makes up the 
image segmentation device 106. The hardware components of the image
segmentation device 106 include a microprocessor 110 which controls the overall
operation of the image segmentation device 106 and also carries out image
segmentation processes in accordance with the invention. Connected to the
microprocessor 110 are memory 112, which is a RAM for storing a program to
control the microprocessor 110 and also functions as a working memory, and mass
storage 116 in which image information to be processed by the image segmentation
device 106 may be stored. Of course, program information may also be stored in the 
storage device 116. The mass storage 116 may correspond to the above-referenced 
integrated digital disk recorder board. Or, the mass storage 116 may be a combination 
of the recorder board and a standard hard disk, or simply a standard hard disk alone. 

Also connected to the microprocessor 110 is a data communication interface
118 through which the image segmentation device 106 receives image information to 
be processed from the information storage device 102 and transmits key signal 
information to the color correction device 104. 

The user interface for the image segmentation device 106 includes a display
device 120 driven by the microprocessor 110 and a drawing device 122 connected to
the microprocessor. In a preferred embodiment of the invention, the drawing device 
122 is the Intuos II stylus and tablet/mouse peripheral which is commercially 
available from Wacom Technology Corporation, Vancouver, Washington. 

The image segmentation device 106 may also include other input/output
components (not shown in the drawing) which are standard in personal computers,
such as a keyboard, speakers, etc. The drawing device 122 may be constituted by only
one of a stylus/tablet or mouse, and/or by a trackball, light pen or touch screen.

Indicating Region of Interest in Image

Processes carried out by the image segmentation device 106 in accordance 
with the invention will now be described with reference to Figs. 3 and 4. For the 
present discussion it will be assumed that image segmentation device 106 has 
received image information from the information storage device 102. The received
image information may represent a single image to be segmented or may represent 
plural images, including images making up a dynamic sequence of images (e.g. a 
video clip). In the case of processing a video clip, either complete frames or 
individual fields may be processed depending on the origin of the video clip. The
discussion to follow will be concerned with segmentation of images in a video clip,
but many of the segmentation techniques to be described are also applicable to still 
images. It will further be assumed that a first image in the video clip has been 
selected for processing by the human operator. Accordingly, as shown in Fig. 6, an 
image 208 which is to be segmented for color correction is displayed in an image
window 210 in a graphical user interface screen 212. At this point, and in accordance
with block 410 in Fig. 3, the human operator is permitted to input signals by means of 
the drawing device 122 to implement a software drawing tool by which the operator 
roughly indicates a desired segmentation of the image 208. For purposes of 
illustration, it will be assumed that the task to be performed is color correction of the
flesh tones of the model 214 who is seen in image 208. The width of the drawing
figure to be drawn by the drawing tool can be adjusted by means of slide bar 215 and 
the currently selected width is indicated at 217. 

Fig. 7 is a screen display similar to Fig. 6 but showing a region 216 which is 
generated by the image segmentation device 106 by operation of the software drawing
tool in response to signals input by the human operator via the drawing device 122. It
will be seen that the region 216 is a partial rough outline of portions of the image
which correspond to the model's skin. The region 216 is in the form of an extended
band. The region 216 is defined as the locus of a circle moved in an arbitrary manner 
as indicated by the drawing device 122. The region 216 also corresponds to the
portion of the image plane between a pair of substantially parallel free-hand drawing
lines 218 and 220 which are simultaneously generated on the screen as the human 
operator draws using the drawing device 122. The lines 218 and 220 are 
"substantially parallel" in the sense that the distance across region 216 in the direction
normal to lines 218, 220 is substantially constant along the length of region 216. This 
distance across region 216 is equal to the width of the drawing tool as selected by 
means of slide bar 215 and indicated by feature 217. As is common with free-hand 
software drawing tools, the lines 218 and 220 may be either curved or straight, and in
general may be used to define an arbitrary, irregularly shaped region. (As will be 
appreciated by those who are skilled in the art, conventional drawing software 
packages include: (a) shape tools by which predetermined geometric shapes such as 
rectangles or other polygons, circles and ovals are created, positioned and stretched or
shrunk or otherwise manipulated, (b) "connect-the-dots" tools by which straight line
segments are generated between control points established by the user, and (c) 
free-hand tools in which a line is generated on the screen without any restriction as to 
shape and governed solely by the locus through which the mouse or other drawing 
instrument is moved, akin to doodling with a pencil on a piece of paper. The software
drawing tool which generates region 216 is of the latter type, having a user-adjustable
width, which is also a conventional feature.) 

Although the region 216 may be drawn with a single continuous stroke, this is 
not required. The region 216 may also be indicated with multiple disconnected 
strokes, may have an irregular border, may be defined by repeated short motions or
sketching by the drawing device, may be filled by additional strokes along the outside
and/or the inside, and may have multiple branches and regions. There is no restriction 
on the manner or order in which the region is drawn and there is no restriction on the 
shape of the region. However, for best results the region should be shaped and 
positioned so that the desired object boundary is approximately at the center of the
region. The region 216 may be indicated on the screen display by changing the
luminance level and/or a color tint in the region relative to the balance of the image. 
Features of the underlying image remain visible in the region 216 (see, e.g., the 
model's right hand at 222) and thus are not occluded by the region 216. 

Fig. 8 is another screen display similar to Figs. 6 and 7, but showing the region
216 after it has been completed so as to surround the entire portion of the image
which corresponds to the model's skin. Each of the lines 218 and 220 (which together 
define the region 216) forms a respective closed freehand figure, with the figure 
defined by line 220 being entirely contained within the figure defined by line 218. It
will be understood that the region 216 itself constitutes a free-hand drawing figure. In 
the display shown in Fig. 8, the background 224 of the image, corresponding to the 
area outside of the region 216, is at a reduced luminance relative to the region 216, 
and the inside 226 of the area selected for color correction is at an increased
luminance level relative to the region 216. This is indicative of the operator having
designated area 226 to be an "inside" region and area 224 to be an "outside" region
relative to the highlighted region 216. The designation of the "inside" region may be 
made automatically by the system 100 (e.g. by selecting the smaller of two regions
partitioned in the image plane by the region 216) or may be designated by the
operator. A designation made by the system may be over-ridden by the operator. 
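
The automatic designation by size can be sketched as a flood fill over the pixels not covered by the region of interest. The use of 4-connectivity, the name `pick_inside_region`, and the choice of the smallest region when more than two are found are assumptions made for illustration.

```python
from collections import deque


def pick_inside_region(roi, width, height):
    """Flood-fill the pixels outside the ROI and return the smallest connected region."""
    unvisited = {(x, y) for x in range(width) for y in range(height)} - roi
    regions = []
    while unvisited:
        seed = unvisited.pop()
        region, queue = {seed}, deque([seed])
        while queue:
            x, y = queue.popleft()
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if n in unvisited:
                    unvisited.remove(n)
                    region.add(n)
                    queue.append(n)
        regions.append(region)
    return min(regions, key=len) if regions else set()


# A one-pixel-wide square ring in a 7x7 image plane encloses a 3x3 interior.
outer = {(x, y) for x in range(1, 6) for y in range(1, 6)}
inner = {(x, y) for x in range(2, 5) for y in range(2, 5)}
inside = pick_inside_region(outer - inner, 7, 7)
```

An operator override, as described above, would simply replace the returned region with the operator's own choice.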

By drawing the region 216 in the image plane, the human operator indicates to 
the image segmentation device 106 a specific, limited portion of the region in which 
the image segmentation device is to perform image segmentation processing to find
boundaries of an object to be color corrected. The region drawn by the operator may
be referred to as a region of interest (ROI), and appears in Fig. 9 as a shaded freehand 
drawing figure 216, corresponding to the "highlighted" region 216 of Fig. 8. An 
inside region 232, indicated in white in Fig. 9, is bordered by the region of interest 
216 and represents a portion of the image which is entirely inside the object selected
by the operator. A background or outside region 234 is also bordered by the region of
interest 216 and is indicated in dark tones in Fig. 9. In connection with the display of 
Fig. 9, the operator may again have the option to modify the region of interest 216 by 
using the drawing device 122. 

In the example which has been illustrated hereinabove, a single object was
selected for color correction by means of a single closed drawing figure which defines
a single closed region of interest. However, a preferred embodiment of the invention 
provides many other options to the human operator in terms of selecting objects and 
drawing regions of interest. For example, the region of interest need not be a closed 
drawing figure, but rather can be terminated at one or more sides of the image plane.
Also, the drawing figure to define the region of interest need not simply be drawn
with one continuous stroke of the drawing tool. The region of interest may be 
expanded by drawing additional strokes with the drawing tool to shade or fill in the 
region of interest either to the inside or outside or both. Thus the operator may
increase the width of the region of interest at particular portions of the ROI which had 
previously been designated and displayed. 

There also need not be a one-to-one relationship between regions of interest 
and objects selected for color correction. Thus, an object which surrounds an area
which is not to be considered the object (e.g., a doughnut seen in plan view), may be 
defined by means of two unconnected regions of interest. In the example just given, 
one region of interest would be drawn to indicate the outer perimeter of the doughnut, 
and a second region of interest drawn to indicate the inner perimeter of the doughnut.
To define the key boundary in this case, the image segmentation device performs two
image segmentation processes, one constrained to the first region of interest and the 
second constrained to the second region of interest. 

A preferred embodiment of the invention also allows the operator to select 
several objects in the image for color correction simultaneously, using respective
regions of interest to select each of the objects. For example, in the image shown in
Fig. 6, the operator could draw a respective region of interest around each one of 
several of the flowers shown in the image, and the image segmentation device would 
then find the boundaries of each of the flowers to generate a key map made up of 
several disjoint parts. If more than one object is selected in an image, different
luminance levels or color tints may be displayed in the respective regions
corresponding to the selected objects, to indicate that different post-processes, such as 
different color correction processes, are to be applied to the various objects. 

The region of interest can also be processed in a "skeleton" mode (accessible 
by the control 328 shown in Fig. 6). When the ROI is processed in the skeleton mode,
the image segmentation device automatically analyzes drawing figures generated in
this mode to derive a "skeleton" of the drawing figure in accordance with known 
image analysis techniques. ("Skeleton" is a term of art that is well understood in the 
context of image analysis processing.) The resulting skeleton is then automatically 
designated to be an inside region. This mode is particularly useful when it is desired
to select for color correction thin linear objects such as plant stems or birds' legs. The
selection can simply be done by drawing a linear stroke of the drawing tool, in the 
skeleton mode, along the length of the object to be selected. If such a linear stroke is 
drawn so as to be attached to an existing region of interest with an inside designated
region surrounded by the region of interest, the interior region designated as the 
skeleton will be connected to the designated inside region. 

Another mode of operating the drawing tool, which may be referred to as a 
strand mode, is somewhat similar to the skeleton mode, but does not require analysis
of a drawing figure to find the skeleton thereof. Instead, figures drawn using the 
strand tool automatically include a third line which appears on the screen parallel to 
and halfway between the two lines which are effectively defined by the left and right 
sides of the band drawn by the drawing tool. This inner line is automatically
designated to be an inside region relative to the region of interest defined by the locus
of the figure drawn by the drawing tool. Thus this tool simultaneously draws three 
free-hand lines in parallel to each other with equal spacing between the first and 
second line and between the second and third line. The region of interest is defined 
between the first and second line and between the second and third line with the 

15 second line itself being a narrow inside region. If such a tool is employed to draw a 
closed figure, the result would be three closed line figures with the second contained 
inside the first and the third contained inside the second. A first region of interest 
defined between the first and second line figures would be subjected to an image 
segmentation operation, as would a second region of interest defined between the 

20 second and third line figures. 

In addition to or instead of the free-hand drawing tools provided in the
above-described embodiments of the invention, it is also contemplated to provide a
drawing tool of the type, well known from computer drawing software packages, in
which straight lines (of adjustable width) are drawn sequentially between control
points selected by the operator. Such drawing tools are sometimes referred to as
"connect the dots" tools. The present invention also contemplates a further alternative
drawing tool to be used to designate a region of interest instead of or in addition to the
tools described hereinabove. In accordance with this aspect of the invention the
operator is permitted to draw a single fine line completely inside or completely
outside the object of interest. The line may be automatically closed in accordance
with conventional techniques if the operator so selects. An operator-actuatable
control then causes the width of the line to be increased toward the inside or outside
of the object, as the case may be, until the widened line covers the object boundary.
Preferably the widening of the line is continued by the operator until the line is more
or less evenly divided by the object boundary. The widened line now may be taken to
be a region of interest in which image segmentation may be performed. A drawing
tool having some width can be used to adjust the region of interest by adding to it or
erasing parts of it. It is also contemplated to apply the step of increasing the width
of the line to a line drawn with a tool having some substantial
width, i.e. to a band-shaped region as previously described.

Another drawing tool (indicated at 330 in Fig. 6), referred to as the "mesh"
tool, is available to the operator to allow him or her to designate sections of the region
of interest for carrying out a supplemental detail finding algorithm in areas where the
desired object's boundary is highly complex. The designation of such a supplemental
region of interest is indicated by block 411 in Fig. 3. Preferably the supplemental
detail ROI sections generated at block 411 are displayed with a mesh pattern or other
distinctive marking to distinguish them from the main ROI section 216 as seen in Fig. 8.
(No such supplemental ROI section is shown in the drawings.) In the particular image
shown in Fig. 6, it might be desirable to invoke complex boundary finding where the
model's hair partially hides her forehead, as seen at 331 in Fig. 6. For purposes of
segmenting images in a video clip, the supplemental ROI may be attached as an
appendage to the main ROI, so that the supplemental ROI has its position changed as
the main ROI has its position changed in accordance with practices to be described
below. The supplemental ROI may be appended so as to fall along the center of the
main ROI, or may have offset data associated with the supplemental ROI so that the
supplemental ROI is appended inside or outside the rough boundary indicated by the
main ROI. It may also be desired to draw the entire ROI with only the mesh tool, in
which case the main ROI and supplemental ROI are the same.

Referring again to Fig. 9, an extended region of interest (EROI) 236 extends
both inwardly and outwardly from the ROI 216. No image segmentation process is
carried out within the EROI, but key signal processing (to be described below) may
occur within the EROI 236. Moreover, motion measurement may be constrained to
occur only within EROI 236.

In effect, by drawing a region of interest, the operator has given to the image
segmentation device a general indication of where in the image plane to find the
boundary of an object selected by the operator. Based on the locus of the ROI, the
image segmentation device proceeds to perform further processing to segment the
image according to the guidance provided by the operator. This processing is
represented by blocks 412 and 413 in Fig. 3. At block 412 the image segmentation
device extracts luminance and color component information for the portion of the
image corresponding to ROI 216. The process then may continue according to any
one of a large number of feature extraction techniques. In a preferred embodiment of
the invention simple (Sobel) edge detection processing is used to generate an
"external force field" which drives a "snake" (active contour) to find the object
boundary.

First, the segmentation device calculates, for each pixel in the ROI, a metric
which will be referred to as the "ROI distance" (D_ROI). Fig. 10 schematically
illustrates how this data is calculated. In Fig. 10, a portion of an ROI 216 is shown,
including a pixel 242 for which a D_ROI is to be calculated. The calculation is based
on the distance D_I between the pixel 242 and a pixel 244 which is the closest pixel in
the inside region 232 to pixel 242. The image segmentation device also defines a
distance D_O between pixel 242 and a pixel 246 which is the closest pixel in the
outside region 234 to the pixel 242. The ROI distance D_ROI is then calculated as
D_I/(D_I + D_O).
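
By way of non-limiting illustration, the ROI distance calculation may be sketched in Python as follows. The function name roi_distance and the brute-force nearest-pixel search are hypothetical and are not taken from the described device. The printed formula is garbled in the source text, so the numerator used here (the inside distance) is an assumption; using the outside distance instead would give the mirror-image metric, with the value 0.5 falling at the same pixels in either case.

```python
import math

def roi_distance(pixel, inside, outside):
    """Return the normalized ROI distance for one pixel of the ROI.

    pixel is an (x, y) tuple; inside and outside are non-empty collections
    of (x, y) pixels belonging to the inside and outside regions.
    """
    d_i = min(math.dist(pixel, p) for p in inside)    # distance to nearest inside pixel
    d_o = min(math.dist(pixel, p) for p in outside)   # distance to nearest outside pixel
    # Assumed form: 0 next to the inside region, 1 next to the outside region.
    return d_i / (d_i + d_o)
```

A practical implementation would more likely use a distance transform than an exhaustive search; the search is kept here only because it mirrors the "closest pixel" wording of Fig. 10.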

On the basis of the ROI distance data calculated for the pixels of the region of
interest 216, the image segmentation device goes on to calculate further data. As
indicated by block 242, the image segmentation device defines a center locus of the
region of interest as formed by pixels which have an ROI distance substantially equal
to 0.50. This center locus for the ROI is indicated as the outer perimeter of a shaded
area 216' in Fig. 11. If Fig. 11 is compared with Fig. 9, it will be observed that the
shaded region in Fig. 11 has been reduced in width by one-half relative to the shaded
area in Fig. 9.

The ROI distance data is also used to calculate a metric called "ROI width" for
each point in the ROI. This is done by adding, for each point at the ROI center, the
respective D_I (distance to nearest inside pixel) plus D_O (distance to nearest outside
pixel). Both of these measures were previously referred to in connection with Fig. 10.
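
The center locus and ROI width may be illustrated with the following Python sketch. The names nearest and center_and_width, and the tolerance tol used to decide what counts as "substantially equal to 0.50", are assumptions made for illustration only.

```python
import math

def nearest(pixel, region):
    """Distance from pixel to the closest pixel of region."""
    return min(math.dist(pixel, p) for p in region)

def center_and_width(roi, inside, outside, tol=0.05):
    """Map each center-locus pixel of the ROI to its ROI width (D_I + D_O).

    A pixel is treated as lying on the center locus when its normalized
    ROI distance is within tol of 0.5 (tol is an assumed parameter).
    """
    widths = {}
    for px in roi:
        d_i, d_o = nearest(px, inside), nearest(px, outside)
        if abs(d_i / (d_i + d_o) - 0.5) <= tol:
            widths[px] = d_i + d_o
    return widths
```

For a straight band four pixels wide, only the middle pixel is selected and its width is 4, which matches the intuition that the ROI width is the local thickness of the band drawn by the operator.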

Furthermore a "relief map" is generated for the region of interest. The "relief
map" is a vector field which is the derivative or slope in the X and Y (horizontal and
vertical) directions of the ROI distance which was calculated for all of the pixels in
the region of interest.

The ROI center information and ROI width data are used to generate another
kind of mapping data for pixels in the region of interest. This data is referred to as
"edge axis" mapping data and is generated as follows. For each point at the ROI
center, an average normal direction to the ROI center boundary is determined. This
direction is then compared to the following four default directions: "north-south",
"east-west", "northeast-southwest", and "northwest-southeast". The one of these four
default directions which is closest to the determined normal direction at the ROI
center point being considered is selected, and then the selected default direction is
assigned to all pixels surrounding the center point and within a distance from the
center point equal to one-half the ROI width at the center point. Of course, this
default direction data is not assigned to points outside of the region of interest. Also,
since this process applies to each point on the ROI center, and because the center
bends along its length, more than one of the default directions may be assigned to at
least some of the pixels in the ROI. The assigned default direction data are used, as
will be seen, as an input to directional edge detection processes which will now be
discussed. According to alternative embodiments, the normal direction at each point
of the ROI center may be quantized to more or fewer than four values, or the raw
normal direction itself may be stored as an input to a directional edge detection
process.
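
The edge axis mapping just described may be sketched as follows. This is an illustrative Python outline only; the names AXES, quantize_axis and assign_axes, the input layout, and the coordinate convention (x increasing eastward, y increasing northward) are assumptions and are not specified in the text.

```python
import math

# The four default axes, expressed as angles in degrees modulo 180.
# Assumed convention: x grows toward the east, y grows toward the north.
AXES = {
    "east-west": 0.0,
    "northeast-southwest": 45.0,
    "north-south": 90.0,
    "northwest-southeast": 135.0,
}

def quantize_axis(nx, ny):
    """Return the default axis closest to the normal direction (nx, ny)."""
    angle = math.degrees(math.atan2(ny, nx)) % 180.0

    def gap(axis_angle):
        d = abs(angle - axis_angle) % 180.0
        return min(d, 180.0 - d)          # axes wrap around at 180 degrees

    return min(AXES, key=lambda name: gap(AXES[name]))

def assign_axes(center_points, roi_pixels):
    """Build the edge axis map.

    center_points is a list of ((cx, cy), (nx, ny), width) entries, one per
    point of the ROI center. Every ROI pixel within width / 2 of a center
    point receives that point's quantized axis, so a pixel may collect
    several axes. Pixels outside roi_pixels are never assigned.
    """
    axis_map = {px: set() for px in roi_pixels}
    for center, (nx, ny), width in center_points:
        name = quantize_axis(nx, ny)
        for px in roi_pixels:
            if math.dist(px, center) <= width / 2:
                axis_map[px].add(name)
    return axis_map
```

Storing a set of axes per pixel reflects the statement that more than one default direction may be assigned where the ROI center bends.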

Feature Extraction 

As noted before, in a preferred embodiment of the invention, feature
extraction is implemented as a conventional edge detection technique, such as the
well-known Sobel edge detector, but modified for detection in a desired direction.
Prior to application of edge detection or other feature detection processing, low pass
filtering may be applied to the image data in one or both of the main ROI and the
supplemental detail ROI. In a preferred embodiment, the edge detector operates as a
convolution in luminance space on each pixel and its eight nearest neighbors with a
separate convolution kernel for each direction. The luminance edge detection of
block 256 is applied at each pixel location in the ROI in the one or more directions
that were assigned to the pixel by the edge axis map generated at block 254, and the
resulting edge data for each of the indicated directions is averaged. The luminance
edge detector is not operated outside of the region of interest.
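
A directional detector of this kind may be sketched in Python as shown here. The kernel values are the familiar Sobel kernels together with their commonly used diagonal variants; the text does not give the kernels actually employed, so these values, the use of the absolute response, and the names KERNELS and directional_edge are assumptions.

```python
# 3x3 kernels, one per default axis. Rows run north to south, columns west to east.
KERNELS = {
    "east-west": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "north-south": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    "northeast-southwest": [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
    "northwest-southeast": [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
}

def directional_edge(luma, x, y, axes):
    """Average the absolute kernel responses at interior pixel (x, y).

    luma[row][col] holds luminance values; axes is the collection of axis
    names assigned to this pixel by the edge axis map.
    """
    if not axes:
        return 0.0
    total = 0.0
    for name in axes:
        kernel = KERNELS[name]
        response = sum(
            kernel[j][i] * luma[y - 1 + j][x - 1 + i]
            for j in range(3)
            for i in range(3)
        )
        total += abs(response)
    return total / len(axes)
```

For a vertical luminance step, the east-west kernel responds strongly while the north-south kernel returns zero, so a pixel assigned both axes receives the average of the two.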

A similar directional edge detection process is carried out with respect to color
information for the pixels in the ROI. Again, the edge axis mapping information is
used to indicate the direction in which edges are to be sought at each pixel. This color
edge detection process is applied to the color component information extracted at
block 412 in Fig. 3. However, contrary to conventional practices, the color edge
detection processing is not applied to the color information on a
component-by-component basis. Rather, edge detection is performed on the basis of
distances between pixels in a multi-axial color space. That is, the edge detection
algorithm operates by calculating Euclidean distances among one or more pairs of
nearest neighbors of the pixel in question, as measured in multi-axis color space. This
is different from using simple subtractive distances in a single color component axis,
as has been prescribed by the prior art. The color space in which the distances are
calculated is defined by R-Y and B-Y axes in a preferred embodiment of the
invention. However, it is contemplated to use other sets of axes, such as hue and
saturation, and to use color spaces having more than two dimensions. The present
inventors have found that applying color edge detection to Euclidean color space
distances provides much more effective object boundary detection than the single
component-based edge detection processing applied to color information according to
the prior art.
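
The color-space distance measure may be illustrated by the following Python sketch, which compares the two neighbors lying on opposite sides of a pixel along its assigned axis. The choice of exactly one neighbor pair per axis, and the names PAIRS and color_edge, are assumptions; the text states only that one or more pairs of nearest neighbors are compared.

```python
import math

# Offsets (dx, dy) of the two opposite neighbors for each axis.
# Rows grow toward the south, so "north" is dy = -1.
PAIRS = {
    "east-west": ((-1, 0), (1, 0)),
    "north-south": ((0, -1), (0, 1)),
    "northeast-southwest": ((1, -1), (-1, 1)),
    "northwest-southeast": ((-1, -1), (1, 1)),
}

def color_edge(chroma, x, y, axis):
    """Euclidean distance in (R-Y, B-Y) space between opposite neighbors.

    chroma[row][col] is an (r_minus_y, b_minus_y) tuple and (x, y) must be
    an interior pixel.
    """
    (ax, ay), (bx, by) = PAIRS[axis]
    return math.dist(chroma[y + ay][x + ax], chroma[y + by][x + bx])
```

If the two neighbors differ by 3 in R-Y and by 4 in B-Y, a component-by-component detector sees differences of 3 and 4, whereas the color-space distance is 5, so a change shared between the two components is measured as a single, stronger edge.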

Like the luminance edge detector, the color edge detector may be a variant of
a conventional edge detection process such as the Sobel detector, and is constrained to
operate only within the region of interest. After the luminance and color edge
detection processes are complete, it is preferred that the resulting data of each one be
normalized to a range of 0-1.0, and then each is raised to a power such that the mean
of each lies at the same value. These normalization and mean-matching steps have
been found to provide optimum performance of subsequent operations.
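
One possible way to carry out the normalization and mean-matching steps is sketched here in Python. The text does not state the common mean value or how the exponent is found, so the target argument and the bisection search are assumptions, as are the function names.

```python
def normalize(values):
    """Rescale a list of edge strengths linearly to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def match_mean(values, target, iters=60):
    """Find an exponent p so that the mean of v**p is close to target.

    values must already lie in 0-1. For such values the mean of v**p falls
    as p rises, so p can be found by bisection (done here in log space).
    If target is not reachable the search simply stops near a bound.
    """
    lo_p, hi_p = 1e-3, 1e3
    for _ in range(iters):
        p = (lo_p * hi_p) ** 0.5
        mean = sum(v ** p for v in values) / len(values)
        if mean > target:
            lo_p = p          # mean too high: a larger exponent is needed
        else:
            hi_p = p
    p = (lo_p * hi_p) ** 0.5
    return p, [v ** p for v in values]
```

Applying match_mean with the same target to both the luminance and the color edge data leaves each in the range 0-1 while giving the two maps equal means, so that neither dominates when they are subsequently combined.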
Fig. 12 illustrates edge detection data generated by the directional luminance
edge detector. It will be observed from Fig. 12 that the luminance edge detector
found rather definite edges at areas indicated at 260, 262, 264, and 266 in Fig. 12. At
other parts of the region of interest, such as those indicated at 268 and 270, the
detector found rather weak or virtually nonexistent edges.

Fig. 13 shows the edge detection information generated on the basis of the
color image information. Strong edges are seen at 272 and 274 in Fig. 13,
whereas no strong edges were found in areas 276, 278.

The luminance-based and color-based edge detection information are then
combined to produce combined edge information. The two edge detection maps may
be combined additively, but preferably for each pixel the maximum of the luma and
color edge detection data is taken to provide the combined edge data. Also, either one
of the luma edge detector and the color edge detector may be disabled, as indicated at
control portion 282 of Fig. 6. It is contemplated to combine additional feature maps
generated by algorithms which extract features based on direct measurement and/or
derivatives of one- or multi-dimensional image parameter spaces, resulting in a
combined feature map.
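
The combination step may be sketched in Python as follows. The function name combine_edges and its flag arguments are hypothetical and stand in for the operator controls mentioned above.

```python
def combine_edges(luma, color, mode="max", use_luma=True, use_color=True):
    """Combine per-pixel luma and color edge strengths.

    mode "max" takes the per-pixel maximum (the preferred combination);
    any other mode adds the two maps. Either detector may be disabled.
    """
    if not use_luma and not use_color:
        raise ValueError("at least one edge detector must be enabled")
    if not use_luma:
        return list(color)
    if not use_color:
        return list(luma)
    if mode == "max":
        return [max(l, c) for l, c in zip(luma, color)]
    return [l + c for l, c in zip(luma, color)]
```

Taking the maximum keeps the combined data in the same 0-1 range as its inputs, whereas additive combination can reach 2.0 and would need rescaling before later steps that assume a fixed range.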

The combined edge map is illustrated in Fig. 14. It will be observed from Fig.
14 that a rather strong edge has been found virtually all along the region of interest.

A weighting or biasing function is applied to the edge detection data across the
transverse dimension of the region of interest so as to emphasize components of the
edge detection data that are located toward the center of the region of interest. For
that purpose a Gaussian or similar weighting function (such as a sinusoidal peak
function) is applied across the region of interest, as schematically illustrated in Fig.
15.

The weighting function is applied in a manner which accommodates arbitrary
shapes of the ROI. The weighting function is defined over the range 0-1, inclusive. It
will be recalled that an ROI distance metric has been defined for each point in the
ROI, having values in that range which indicate the distance of the respective
point from the inside and outside regions. The weighting function would be defined to
have the value 0 for the 0 and 1 values of the ROI distance metric and a value of 1 for
the 0.5 value of the ROI distance metric, with a suitable tapering for values of the ROI
distance metric between 0 and 0.5 and between 0.5 and 1. Such a weighting function
can easily be implemented as a lookup table. Since the ROI distance function is
completely defined over any ROI of arbitrary topology, the weighting function can be
defined for an ROI of any shape. As a result, the present invention can allow the user
to designate an ROI having any arbitrary shape.
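
A lookup-table implementation of such a weighting function may be sketched in Python as follows. The table size, the sigma value, and the rescaling of the Gaussian so that it reaches exactly 0 at both ends are assumptions made for this illustration, as are the function names.

```python
import math

def build_weight_lut(size=257, kind="sine", sigma=0.2):
    """Tabulate a weight for ROI distances 0..1 (1 at 0.5, 0 at the ends).

    kind "sine" gives a sinusoidal peak; any other kind gives a Gaussian
    that is shifted and scaled so its ends fall to 0 and its peak is 1.
    An odd size places a table entry exactly at ROI distance 0.5.
    """
    lut = []
    for i in range(size):
        d = i / (size - 1)
        if kind == "sine":
            w = math.sin(math.pi * d)
        else:
            g = math.exp(-((d - 0.5) ** 2) / (2 * sigma ** 2))
            g_end = math.exp(-0.25 / (2 * sigma ** 2))   # Gaussian value at d = 0 or 1
            w = (g - g_end) / (1 - g_end)
        lut.append(w)
    return lut

def apply_bias(edge, d_roi, lut):
    """Scale one edge value by the weight looked up from its ROI distance."""
    index = round(d_roi * (len(lut) - 1))
    return edge * lut[index]
```

Because the table is indexed only by the ROI distance of each pixel, the same table serves an ROI of any shape, which is the property relied upon above.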

As a result of the bias function, edges or other features detected near the center
of the region of interest are favored, while those near the edge of the ROI are
suppressed. This bias function effectively sharpens the guidance provided by the
operator's input to guide the machine's image segmentation process toward the center
of the region of interest. This reflects an assumption that the operator will attempt to
more or less evenly bracket the desired object boundary with the inner and outer
perimeters of the region of interest.

The biased edge information is illustrated in Fig. 16. If Fig. 16 is compared
with Fig. 14, it will be observed that the edge components toward the inside or outside
perimeters of the region of interest have generally been reduced (de-emphasized).
Conversely, the components of the edge information at a central portion of the ROI
are emphasized by the bias function.

Although the present inventors have found that the results of the edge
detection processes are enhanced by using directional edge detectors based on the
edge axis map, it has been found that adequate results may also be obtained by using
non-directional edge detectors, in which case the edge axis mapping process may be
dropped. It is also contemplated to use component-based color edge detection instead
of the color-space-based edge detection which was referred to above.

Boundary Finding 

The next step in the segmentation process is to locate the position of a
boundary by using the extracted and processed feature map. In a preferred
embodiment, snakes or active contours are used within a gradient vector flow (GVF)
field to find the position of the object boundary. The biased edge detection data is
processed to generate a gradient vector flow field, which is generally of the type
discussed in the following papers:

C. Xu and J.L. Prince, "Gradient Vector Flow: A New External Force for
Snakes," Proc. IEEE Conf. on Comp. Vis. Patt. Recog. (CVPR), Los Alamitos:
Comp. Soc. Press, pp. 66-71, June 1997;

C. Xu and J.L. Prince, "Snakes, Shapes, and Gradient Vector Flow," IEEE
Transactions on Image Processing, pp. 359-369, March 1998.

Some of this material is also available in an article published online entitled 
"Gradient Vector Flow", by Chenyang Xu and Jerry L. Prince, found at 
iacl.ece.jhu.edu/project/gvf. 

The purpose of the gradient vector flow field is to provide an external force
field for a snake or active contour segmentation process.

The biased edge data is then used to calculate an edge gradient vector field,
which may be a Laplacian or derivative of the biased edge information. The
resulting edge gradient field is shown in Fig. 17. This edge gradient data is then
normalized and diffused throughout the region of interest to generate a data field
referred to as the "edge relief map" or GVF field. The resulting map is illustrated in
Fig. 18.

The resulting GVF field is used in connection with another known image
analysis technique referred to as an "active contour model" or "snake". The articles
referred to above contain descriptions of image analysis using snakes, and so does an
article entitled "Active Contour Models (Snakes)", which has been published online at
www.cogs.susx.ac.uk/users/davidy/teachvision/vision7.html. A snake is a model that
may be generated in a two dimensional image plane and includes control points along
the length of the model which are deemed connected by virtual springs. The springs
may reflect various models, but in a preferred embodiment of the present invention
are modeled in accordance with Hooke's Law and have a rather low spring force with
no resistance to bending.

Essentially, the balance of the process for segmenting a single image entails
generating a snake and allowing it to be driven to the desired object boundary through
interaction of the gradient vector flow field and the snake's own internal energy
characteristics. As is known to those who are skilled in the art, snake models operate
to minimize the combined energy of the system in which they operate.

An initial position for the snake is set as the center of the region of interest, as
described above in connection with Fig. 11. The snake is then iteratively repositioned
under the influence of the edge relief map, until a convergence test is satisfied. Each
time the snake is repositioned, it is reparameterized so that the spacing between the
control points is substantially equalized. Also, the snake is not permitted to depart
from the region of interest. It should be noted that it is not a common practice to
constrain snakes within a user-defined region of interest.

The convergence test calls for comparing the movement of the snake with a
threshold to determine whether the snake has moved significantly in the latest
repositioning. If the amount of movement is not significant, convergence is deemed
to have occurred. The snake movement is measured by determining the amount that
each control point has moved in the direction normal to the snake at the control point.
The amount of movement in the normal direction is then averaged over the control
points and the resulting average is compared with the threshold. Once convergence is
found, the image is segmented at the final position of the snake and a key signal is
produced. (The term "key signal" should be understood to include signals used for
various types of image segmentation activities, including keying operations, mattes,
windows, rotoscopes and "cutouts".) The locus of the final snake position defines a
segmentation map for the image plane, the segmentation map being an output of the
image segmentation process. The segmentation map corresponds to a
machine-generated outline of the object selected by the operator.
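
The convergence test may be illustrated by the following Python sketch for a closed snake. The estimate of the normal from the two neighboring control points, the threshold value, and the assumption that both snake positions have the same number of control points in corresponding order are simplifications made for illustration; the function names are hypothetical.

```python
def mean_normal_motion(prev, curr):
    """Average movement of the control points along the snake normal.

    prev and curr are equal-length lists of (x, y) control points for a
    closed snake before and after one repositioning step.
    """
    n = len(curr)
    total = 0.0
    for i in range(n):
        # Tangent estimated from the two neighbors; the normal is the
        # tangent rotated by 90 degrees.
        tx = curr[(i + 1) % n][0] - curr[i - 1][0]
        ty = curr[(i + 1) % n][1] - curr[i - 1][1]
        length = (tx * tx + ty * ty) ** 0.5 or 1.0
        nx, ny = -ty / length, tx / length
        dx = curr[i][0] - prev[i][0]
        dy = curr[i][1] - prev[i][1]
        total += abs(dx * nx + dy * ny)
    return total / n

def converged(prev, curr, threshold=0.1):
    """True when the average normal movement is below the threshold."""
    return mean_normal_motion(prev, curr) < threshold
```

Measuring only the normal component means that control points sliding along the contour, for example as a side effect of reparameterization, do not by themselves delay convergence.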

In Fig. 19 a rather bright outline 298 is indicative of the final snake position,
and hence the locus of the segmentation map. It will be observed that the outline 298
quite accurately indicates the boundary between the skin area selected by the human
operator and the balance of the image. The segmentation map is also illustrated in
Fig. 20 in the form of a "hard" key mask.

If reasonably high-end PC hardware is used, the process of machine image
analysis, from the time the operator indicates drawing of the ROI 216 is complete
until the outline 298 is drawn by the computer, requires only a few seconds or less.
The process of drawing the ROI itself also need only take a few seconds.

In the embodiments described herein, snakes (active contour models) have
been used to find object boundaries in the image. However, the invention
contemplates using boundary locating techniques other than snakes.

Moreover, assuming snakes or similar techniques are employed, there are
many possible variations in the manner of calculating an external force field for the
snake.

Although the snake-based segmentation process described above often
produces a very accurate result in terms of finding the desired object boundary, it is
also contemplated (although not essential) to apply "high level" constraints to the
segmentation process to further improve the accuracy of the process. The application
of high level constraints is represented by block 414 in Fig. 3. If a snake is employed
to find the boundary, certain high level constraints, such as continuity or closure,
smoothness, resistance to bending and elasticity, may be inherent in the processing of
block 413. But if other boundary finding techniques, such as Canny edge detection,
are employed at block 413, then high level constraints such as those enumerated
above may be applied at block 414.

Another high level constraint that may be employed at block 414, even when a
snake is employed at block 413, is referred to as "shape memory". In essence, shape
memory may be applied at parts of the nominal boundary where the edge detection
information is weak (exhibits low confidence). Assuming that the image segmented
in block 413 is not the first in a scene, the shape of the outline in the low confidence
region is clipped from the corresponding portion of the boundary outline in an
immediately preceding image, and the clipped outline segment is spliced into the
region of low confidence. If the image is the first in a scene, the low confidence
portion of the outline may be replaced by the corresponding segment of the initial
snake position along the ROI center. Since it is possible that the outline has moved
and/or changed shape from image to image, the clipped segment must be transformed
to match the new outline. This is done by measuring the transformation at the regions
of good confidence adjoining the splice and interpolating over the length of the splice.
To make this possible, each control point on the snake carries or drags with it certain
parametric data as the snake is iteratively repositioned in block 413. This parametric
data is stored in a shadow data structure, and may include edge strength data (relevant
to the "shape memory" feature, and also relevant to the "edge modulated softness"
feature described below), ROI width data (relevant to ROI relocation, to be discussed
below), and prior positions of the control points (relevant to the above-mentioned
convergence test). As is known, the reparameterization of the snake during block 413
may result in control points being added or dropped. If a control point is dropped, its
corresponding shadow data in the shadow data structure is also dropped. If a control
point is added, the corresponding shadow data for the new control point is generated
by interpolation or the like from the shadow data for neighboring control points.

A variation of shape memory, referred to as "shape history", may also be
employed at block 414. If the shape history constraint is applied, more than one prior
image is considered in forming a splice for a region of low edge confidence. The
outline data from the prior images may be accumulated by adding the coordinates for
each successive outline to a prior average. The scaling factors applied to the most
recent outline data and to the running average may be varied to provide a variable
degree of persistence to the outlines of the prior images.
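
The accumulation of outline data for shape history may be sketched in Python as a weighted running average. The single persistence parameter, with the two scaling factors constrained to sum to one, is an assumption made for this illustration, as is the requirement that the outlines have corresponding points in the same order.

```python
def update_history(average, outline, persistence=0.75):
    """Blend the newest outline into the running average of prior outlines.

    average and outline are equal-length lists of (x, y) points. The prior
    average is scaled by persistence and the newest outline by
    (1 - persistence); pass average=None for the first image of a scene.
    """
    if average is None:
        return list(outline)
    keep, take = persistence, 1.0 - persistence
    return [
        (keep * ax + take * x, keep * ay + take * y)
        for (ax, ay), (x, y) in zip(average, outline)
    ]
```

A persistence close to 1 makes the history respond slowly to new outlines, while a persistence of 0 reduces shape history to plain shape memory based on the immediately preceding image.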

The threshold for the confidence measure, to be used to determine whether 
shape memory or shape history is applied, may be subject to hysteresis. That is, the 
threshold may be set higher at portions of the outline for which low confidence was 
found in a prior image. 

Also, an average confidence measure may be computed for the outline as a
whole, and a graphical display element such as a bar graph may be displayed based on
the average confidence measure to provide an indication of the success of the
segmentation process. The average confidence measure can be thought of as a figure
of merit for the segmentation process.

This figure of merit may be particularly useful in the object tracking process to
be described below. When the confidence measure declines significantly from one
image to the next, it may be assumed that there has been a change of scene, or loss of
tracking of the desired object for some other reason. In these cases, the operator may
be prompted to designate a new ROI.

Another high level constraint that may be applied at block 414 is shape
reluctance. This constraint may be described as a resistance to changing shape, even
in areas of high edge confidence. This constraint may be applied either as a simple
gate, specifying a maximum permitted deviation in shape, or as a continuous function,
whereby indicated changes in shape are scaled nonlinearly.

Another high level constraint that could be applied at block 414 would be to
require the operator to manually adjust the outline at regions which exhibit low edge
confidence. Operator adjustment of the outline will be discussed further below.

Still another high level constraint that could be applied at block 414 would be
temporal smoothing of the outline over multiple images. This technique would
eliminate or minimize contributions from noise or other perturbations (such as edges
which impinge on the object of interest from the background) which momentarily
disrupt the outline shape. Temporal smoothing could be accomplished by imposing a
running average on outline coordinates and/or limiting the rate of derivatives of
outline coordinates.

Once the segmentation of the first image is complete, the region of interest is
repositioned in the image plane taking the segmentation map locus (final snake
position or outline) as the center of the repositioned region of interest. (This step is
represented by block 415 in Fig. 3.) Preferably the re-centering of the region of interest
may be performed adaptively along the segmentation map, in the sense that for a
given point along the segmentation map the region of interest is not repositioned
unless the segmentation reflects a high degree of confidence that the object boundary
was properly found. In other words, the region of interest may only be re-centered for
points of the segmentation map at which a rather definite edge was found. At other
portions of the segmentation map, the shape of the region of interest remains
unchanged. The process of re-centering the region of interest may use the ROI width
metric to set the borders of the region of interest relative to the image segmentation
map. The width metric which is employed at any given point on the image
segmentation map is the width metric which was originally assigned to the
corresponding point on the snake obtained from the ROI width when the snake was at
its initial position at the center of the ROI as drawn by the operator. The re-centering
may alternatively use one of a variety of other techniques including morphing and
patch displacement interpolation.

Once the ROI has been repositioned to be centered on the final outline
position, the ROI products (distance and relief map) may be recomputed to improve
the modulation of the final key shape. The extended ROI (EROI) 236, referred to in
connection with Fig. 11, is repositioned along with the ROI.

Blocks 416 and 417 in Fig. 3 represent processes in which the operator is able
to provide additional input to the boundary finding procedure, beyond the initial
indications of the ROI provided by the operator at blocks 410 and 411.

Block 416 represents an operator adjustment to one or both of the ROIs
designated at steps 410 and 411. The operator adjustment of the ROI(s) may be
performed iteratively after block 414 and/or block 412. The operator may use the
same drawing tools referred to in connection with blocks 410 and 411 to add to the
previously designated ROI(s).

In addition, it is preferred to also provide an erase function whereby the
drawing tool, when applied to the region of interest, causes the region of interest to be
erased. If the erase tool enters the region of interest from the adjoining region
designated as the outside, the erased part of the region of interest is joined to the
outside region. Conversely, if the erase tool enters the region of interest from the
adjoining region designated to be the inside, the erased portion of the region of
interest is added to the inside region.

Block 417 represents an option provided to the human operator to permit
adjustment of the segmentation map (outline). As indicated at 300 in Fig.
21, the operator may select an "adjust outline" option. Upon selecting this option, the
operator is provided with a software drawing tool having quite a narrow width. By
means of this drawing tool, the operator can use the drawing device 122 (Fig. 2) to
erase and redraw portions of the outline 298 to make corrections in the segmentation
map generated by the machine image analysis process.

Boundary Finding in Dynamic Images

The process of Fig. 3 next turns to segmenting a second image in the sequence
of images. For that purpose, and as indicated at block 419, there must be provided an
indication of the motion of the object of interest (or, more precisely, of the boundary
of the object). To provide the motion indication, actual motion from the first image to
the second image to be segmented may be measured (e.g., via optical flow
techniques). If no motion is detected, the outline and ROI from the previous image
may be used without change.

Alternatively, motion of the object boundary may be estimated (e.g., via
extrapolation techniques). If motion is to be estimated, the locus of the boundary
outline must have been established for at least two images prior to the image to be
segmented; accordingly, the operator may be required to manually indicate the ROI
(block 410) for two or more images at the beginning of each scene if motion
projection (extrapolation) is employed.

Motion extrapolation techniques involve developing a measure of gross
geometric parameters of the outline such as the centroid, size, differential scale
factors, and rotational orientation. These geometric parameters are then used to
predict the future position of an object based on prior positions using a model of
Newtonian mechanics. Velocity and acceleration in x and y directions can be tracked
by examining the movement of the centroid and the same factors in the z direction can
be tracked on the basis of change in size. Angular velocity and acceleration about the
major, minor and z axes may also be measured by examining changes in differential
scale and rotational orientation. All of these motions may be extrapolated based on
constant velocity, acceleration, or rate of change of acceleration ("jerk").
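
The extrapolation described above can be illustrated with a minimal sketch. It assumes the geometric parameters are sampled on equally spaced images and uses a finite-difference predictor; the function names `extrapolate_next` and `predict_outline_parameters` are hypothetical and do not come from the specification.

```python
def extrapolate_next(history, order=1):
    """Predict the next sample of one geometric parameter (e.g. centroid x).

    history: values measured on consecutive, equally spaced images.
    order:   1 = constant velocity, 2 = constant acceleration,
             3 = constant jerk (an assumed mapping for this sketch).
    """
    if order not in (1, 2, 3):
        raise ValueError("order must be 1, 2 or 3")
    if len(history) < order + 1:
        raise ValueError("need at least order + 1 prior samples")
    level = [float(v) for v in history[-(order + 1):]]
    prediction = 0.0
    # Backward-difference extrapolation: the next value is the sum of the
    # last entry of each successive row of the difference table.
    while level:
        prediction += level[-1]
        level = [b - a for a, b in zip(level, level[1:])]
    return prediction


def predict_outline_parameters(centroids, sizes, order=1):
    """Extrapolate centroid (x, y) and size; size stands in for z motion."""
    xs = [c[0] for c in centroids]
    ys = [c[1] for c in centroids]
    predicted_centroid = (extrapolate_next(xs, order), extrapolate_next(ys, order))
    return predicted_centroid, extrapolate_next(sizes, order)
```

With `order=1` the predictor needs two prior samples, which matches the requirement that the outline be established for at least two prior images before estimation can begin.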

Motion measurement may be performed in accordance with any one of a 
number of techniques. In a preferred embodiment the optical flow technique of Horn 
and Schunck is used. This technique is described in: Horn, B.K.P., and Schunck,
B.G., "Determining Optical Flow", Artificial Intelligence, 17, pp. 185-204 (1981).

As reported in the literature, many optical flow techniques are unable to 
measure high velocities, unless a supplemental technique, known as a pyramid, is 
employed. The latter technique uses multi-scale and multi-resolution flow 
measurements. Filtered and subsampled (reduced) versions of the images are 
processed to extract high velocity components, which are then passed down for
refinement to higher resolution versions of the image.
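
A minimal sketch of the pyramid idea follows. It assumes 2x2 block averaging as the filter-and-subsample step and a single global displacement per image pair; `reduce_half`, `build_pyramid`, `coarse_to_fine` and the `refine` callback are hypothetical names, and the flow estimator itself is left to the caller.

```python
def reduce_half(image):
    """Filter and subsample by averaging each 2x2 block (an assumed filter)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1] +
              image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]


def build_pyramid(image, levels):
    """Return [full resolution, half, quarter, ...] with at most `levels` entries."""
    pyramid = [image]
    while len(pyramid) < levels and len(pyramid[-1]) >= 2 and len(pyramid[-1][0]) >= 2:
        pyramid.append(reduce_half(pyramid[-1]))
    return pyramid


def coarse_to_fine(pyramid_a, pyramid_b, refine):
    """Estimate one displacement (u, v), starting at the coarsest level.

    refine(image_a, image_b, guess) -> (u, v) is a caller-supplied estimator
    (hypothetical interface) that improves `guess` at one resolution.
    """
    u, v = 0.0, 0.0
    for level_a, level_b in zip(reversed(pyramid_a), reversed(pyramid_b)):
        # One pixel at a coarse level spans two pixels at the next finer level.
        u, v = 2.0 * u, 2.0 * v
        u, v = refine(level_a, level_b, (u, v))
    return u, v
```

The doubling step is what lets a large displacement, measured as a small one on a reduced image, be handed down as a starting point for refinement at higher resolution.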

In a preferred embodiment of the invention, conventional optical flow 
techniques have been modified by applying the same to both luminance and color 
information in an extended region of interest (EROI). In this embodiment, the image 
information in the EROI is normalized before applying optical flow detection
processing.

According to a preferred embodiment of the invention, motion measurement
or estimation is constrained to be carried out within a portion of the image plane
which corresponds to the ROI and EROI, as taken together and after repositioning in
accordance with block 415. It was noted that the EROI is also used as an area in
which key signal modulation (e.g., a "softness" function) may occur. However, it is
also contemplated to provide separate extended ROI's, respectively, for key signal
modulation and motion measurement/estimation.

On the basis of the motion indication obtained at block 419, the outline
obtained in the first image at block 414 is reshaped and relocated to reflect the motion
of the object between the first and second images. Then the ROI(s) are repositioned
so as to be centered on the repositioned outline (block 418). The image segmentation
device now has an indication of where to find the object boundary in the second
image, and may now proceed with the processes of blocks 412-415 with respect to the
second image, so that the second image can be segmented very accurately without
operator input. After segmentation of the second image is complete, steps 419 and
418 may be carried out to prepare for segmentation of a third image. Indeed, the loop
of steps 412, 413, 414, 415, 419, 418 may be carried out sequentially and
automatically to segment a large number of images which make up a video clip. It
should be understood that the outlines used for object tracking in accordance with
steps 418 and 419 may differ in some respects from the outlines used to generate key
signals for individual images. For example, the latter outlines may reflect operator
adjustments that are not applied to the outlines used for tracking.

It is contemplated to apply the above-described process of finding a boundary
in a sequence of dynamic images to a recorded video clip or to a live sequence of
video images, or to the output of a telecine or other source of video images. It is also
contemplated to perform the process in real time, or slower than real time, or faster
than real time.

According to a preferred embodiment of the invention, the image
segmentation device stores the ROI's, the outlines, unmodified key signals for each
image, key modification settings (and also color correction data, if the image
segmentation device itself is arranged to perform color correction) for each image in a
video clip as the image segmentation process proceeds through the clip, either with or
without intervention by the operator. After the processing of the clip is complete, the
operator may review the sequences of data that were generated and stored for the clip.
The operator can select outlines, ROI's, key modification settings and so forth for
adjustment. The adjustments may be applied to an individual image or to a selected
number of images subsequent to the image for which the adjustment is made.
Depending on what adjustments are made, it may only be necessary to perform a
subset of the image segmentation and object tracking processes described above. The
data referred to in this paragraph, whether prior to or after adjustment, may be
archived to provide a complete record of the image processing applied to the video
clip.

Output Signal Options 

Reference is now made to Fig. 4, which illustrates other processes carried out 
by the image segmentation device, primarily in regard to providing output signals 
based on the final boundary outline(s) produced at block 414. 

One user-controlled option, represented by block 420 in Fig. 4 and actuated
through a slide bar 306 shown in Fig. 19, allows the operator to increase or decrease
the size of the outline (Fig. 20). That is, the operator may use the slide bar 306 to
increase or decrease the amount of space (indicated as white in Fig. 20) which is
inside the outline. The direction in which the key mask is adjusted in response to the
operator's input is determined, at each point along the outline, on the basis of the ROI
relief map which was referred to above (and was calculated, e.g., as a derivative in the
x and y directions of the ROI distance metric). In this way, even the arbitrarily
shaped and highly irregular key masks produced by the techniques of the present
invention can be appropriately resized without significant distortion. In general, the
operator will decide whether to enlarge or reduce the size of the outline, or to leave it
unchanged, based on factors such as relative brightness or darkness of the object of
interest relative to the background, color contrast between the object and the
background, the nature (e.g., definiteness) of the boundary of the object, the type of
post processing for which segmentation has been performed, and so forth. At block
421 an "area-fill" operation is performed for the area inside the (optionally re-sized)
outline to generate a key signal to select the desired object, and to de-select the
balance of the image (background).

At this point, as represented by block 422, the operator is provided with a
number of options for modifying the resulting key signal.

According to one option, the operator can use the softness slide bar 310 (Fig.
19) to adjust softness at the edge of the key mask, in accordance with conventional
practices. As is known to those of ordinary skill in the art, when using the softness
adjustment option, the operator varies the width of the transition region between a
region of full keying (key = 1) and a non-keyed region (key = 0). That is, the key
signal takes on values between zero and one in the transition region and hence
exhibits a gradient. In Fig. 5A, a curve 312, having a conventional "S" shape, is
indicative of a rather wide transition region, corresponding to a rather large degree of
softness. When the softness is reduced, an "S" curve 314 is produced, defining a
narrower transition region in which a somewhat steeper slope is present.
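
The relationship between softness and transition width can be sketched as follows. The specification calls for a conventional "S" shape without fixing its formula, so the cubic smoothstep used here is an assumption, as are the names `smoothstep` and `soft_key` and the convention that signed distance is positive inside the outline.

```python
def smoothstep(t):
    """Cubic S-curve on [0, 1]; one common choice, assumed for this sketch."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def soft_key(signed_distance, softness_width):
    """Key value for a pixel at `signed_distance` from the outline.

    signed_distance: positive inside the outline, negative outside (assumed).
    softness_width:  full width of the transition region, in pixels.
    """
    if softness_width <= 0.0:
        # Zero softness degenerates to a hard key.
        return 1.0 if signed_distance >= 0.0 else 0.0
    return smoothstep(signed_distance / softness_width + 0.5)
```

A larger `softness_width` corresponds to the wide curve 312; a smaller one gives the narrower, steeper profile of curve 314.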

Fig. 22 is indicative of a key map to which a degree of softness has been
applied, and may be compared with the hard key map of Fig. 20.

Fig. 23 illustrates how the softened key map of Fig. 22 selects desired portions
of the processed image.

A "clean up" option is also provided to the operator, which is accessible via 
slide bar 318 (Fig. 19). 

The clean-up function allows the operator to increase or reduce the degree of
softness toward the outside of the transition region without affecting the softness
profile toward the inside of the transition region. As illustrated in Fig. 5B, the curve
312 again illustrates a conventional softness profile, whereas, at 320, the softness
profile is "hardened" (i.e. given a steeper slope), in response to the operator's
invocation of the clean-up function, on the side toward an outside (background) region
relative to the keyed portion of the image. As will be observed from Fig. 5B, the
softness profile remains unchanged on the side of the profile which is towards the
inside (keyed) region of the image. Thus, the slope of the gradient of the softness
function is increased, by invocation of the clean-up function, toward the side adjacent
to the outside region but not toward the side adjacent to the inside region.
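
One way to picture the one-sided hardening is to rescale distance only on the outside half of the profile. This is a sketch under assumptions: the smoothstep curve, the `cleanup_gain` parameter and the function names are illustrative and are not taken from the specification.

```python
def smoothstep(t):
    """Cubic S-curve on [0, 1]; an assumed stand-in for the softness profile."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def cleanup_key(signed_distance, softness_width, cleanup_gain=1.0):
    """Softness profile whose outside half can be hardened independently.

    signed_distance: positive inside the outline, negative outside (assumed).
    cleanup_gain:    1.0 leaves the profile unchanged; values above 1.0
                     steepen the slope on the outside half only.
    """
    distance = signed_distance
    if distance < 0.0:
        # Only the background side is rescaled; the inside half is untouched.
        distance *= cleanup_gain
    return smoothstep(distance / softness_width + 0.5)
```

Because both halves meet at key = 0.5 on the outline itself, the profile stays continuous for any gain.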

Still another option provided to the user is accessible via the slide bar 324
shown in Fig. 19 and may be referred to as an "edge-modulated softness" function.
When the operator actuates this function, the image segmentation device causes the
degree of softness to be varied along the perimeter of the key map on the basis of
edge information which had previously been generated from the image information
representing the displayed image. Thus, at points on the perimeter of the key map
where a more definite edge was detected, the degree of softness is made lower, while
at points on the key map where less edge definiteness was detected, the degree of
softness is made greater. The adaptation or modulation of the degree of softness
along the key map perimeter may be based on the edge detection data which is
produced with respect to luminance or color data, or based on both. When this
edge-modulated softness feature is invoked, the degree of softness provided at any
particular point on the key map perimeter depends both on the pertinent edge
detection data for that point and the setting of the slide bar, with the slide bar
controlling the overall extent to which the softness is adjusted based on the edge
detection data.
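
A minimal sketch of how the slide bar setting and the local edge data might jointly set the softness width. The linear mapping, the normalization of edge strength to [0, 1], and the name `modulated_softness` are all assumptions made for illustration.

```python
def modulated_softness(base_width, edge_strength, amount):
    """Softness width at one perimeter point.

    base_width:    softness width with no modulation applied.
    edge_strength: edge definiteness at this point, normalized to [0, 1].
    amount:        slide bar setting in [0, 1]; 0 disables modulation.
    """
    # Strong edges (near 1) shrink the width, weak edges (near 0) widen it,
    # and `amount` scales how far the width may move from base_width.
    return base_width * (1.0 + amount * (1.0 - 2.0 * edge_strength))
```

With `amount` at zero the width is the same everywhere, which corresponds to the conventional softness function.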

Once the operator has completed all adjustments to the key signal which he or
she considers desirable, the resulting key signal is output from the image
segmentation device 106 to the color corrector 104 (Fig. 1). The key signal is then
used in the color corrector 104 as a window to select a portion of the corresponding
image in which a color correction process is carried out. Because the key signal has
been accurately matched to the shape of the desired object, by a unique interplay of
human and machine intelligence, a highly satisfactory color correction process can be
achieved. The key signal produced by the techniques of the present invention is also
suitable for use in image compositing operations. One potential use of the key signal
would be application to the original image to produce an image of the desired object
alone. Conventional spill suppression techniques may then be applied to the isolated
image. Such techniques are well known in the industry and serve to eliminate
background color contamination from the edges of an isolated object. Both the keyed
image and the combined key signal could be provided to a compositing system, either
directly or after storage in the image segmentation device 106 or in the image source
device 102.

Supplemental Detail Finding

There will now be discussed supplemental detail finding operations carried out
on the basis of the supplemental ROI designated at block 411 of Fig. 3. It will be
recalled that the supplemental ROI was drawn at portions of the object of interest
where the boundary was highly complex, say the fur on an animal, or the outer extent
of branches and leaves on a tree. The generation of the key signal for images of this
type may proceed by color or luminance keying techniques or by detecting spatial
frequency characteristics within the supplemental ROI. According to a preferred
color keying technique, the image segmentation device parses the color information
for pixels that are "inside" the object (based on the inside region designated with
respect to the primary ROI of block 410), and those which are outside. Then, for each
pixel in the supplemental ROI, it is determined whether its color is one that is (a) only
found among "inside" pixels; (b) only found among "outside" (background) pixels; or
(c) found both inside and outside the object. Pixels falling within category (a) are
considered to be part of the object; pixels falling within category (b) are excluded
from the object; and those in category (c) are assigned to the object or not, depending
on a variety of criteria, such as whether they are closest to category (a) or category (b)
pixels. Differentiation between inside and outside content in the supplemental ROI
may be based on a variable-sized local area adjacent to each pixel instead of the entire
supplemental ROI. This local area is evaluated for each pixel or group of pixels and
may be defined by a radial distance (circle) or a shape or shapes drawn by the
operator.
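
The three-way color test can be sketched as follows. This is an illustration under assumptions: colors are taken to be already quantized so they can be compared for equality, category (c) pixels are resolved by spatial proximity to the nearest category (a) or (b) pixel (one of the criteria the text mentions), and the name `classify_detail_pixels` is hypothetical.

```python
def classify_detail_pixels(roi_pixels, inside_colors, outside_colors):
    """Decide object membership for pixels in the supplemental ROI.

    roi_pixels:     dict mapping (x, y) -> quantized color.
    inside_colors:  colors observed among "inside" pixels.
    outside_colors: colors observed among "outside" (background) pixels.
    Returns a dict mapping (x, y) -> True (object) or False (background).
    """
    inside_colors, outside_colors = set(inside_colors), set(outside_colors)
    result, ambiguous = {}, []
    for position, color in roi_pixels.items():
        found_inside = color in inside_colors
        found_outside = color in outside_colors
        if found_inside and not found_outside:
            result[position] = True        # category (a)
        elif found_outside and not found_inside:
            result[position] = False       # category (b)
        else:
            # Category (c); colors seen in neither set are also deferred here,
            # which is a choice made for this sketch.
            ambiguous.append(position)
    decided = list(result.items())
    for x, y in ambiguous:
        if not decided:
            result[(x, y)] = False
            continue
        nearest = min(decided,
                      key=lambda item: (item[0][0] - x) ** 2 + (item[0][1] - y) ** 2)
        result[(x, y)] = nearest[1]
    return result
```

Restricting `inside_colors` and `outside_colors` to a local area around each pixel, rather than the whole supplemental ROI, would correspond to the variable-sized local area variant.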

A similar keying process may be carried out based on luminance level
distribution.

Another supplemental boundary-finding technique that may be applied in
conjunction with the supplemental ROI entails using a bandpass filter to detect higher
frequencies, where the object has a more complex texture than the background.

The detail key resulting from block 423 may be resized at block 424 by
logically ANDing with a resized version of the outline key. In this way the operator
can control the extent to which supplemental detail is found over the primary ROI.

At block 425 the complex boundary key can be modified by use of some of
the functions referred to at block 422, including the "clean-up" function and the
conventional softness function.

Block 426 indicates that the detail key may be combined with the outline key.
This may be done in a variety of ways, including selecting the maxima of the
respective keys, or running each through a respective variable gain stage to obtain a
weighted sum of the two keys. It is also contemplated to apply a variable weighting
between the outline key and the detail key at particular portions of the image based on
characteristics of the image and the ROI. It may be desirable to increase the
weighting in favor of the detail key in areas where the confidence of the edge
detection is low, indicating a poorly defined edge. It may also be desirable to increase
the weighting in favor of the detail key where the ROI is relatively wide, since this
too may indicate a poorly defined boundary for the object. The resulting combined
key can be used for the same purposes as the outline key, including color correction
and image compositing.
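
The combining options can be sketched per pixel as below. The keys are assumed to be flat lists of values in [0, 1]; the clamping step and the names `combine_max`, `combine_weighted` and `confidence_weighted` are assumptions introduced for this illustration.

```python
def clamp01(value):
    """Keep a key value inside the valid range [0, 1]."""
    return max(0.0, min(1.0, value))


def combine_max(outline_key, detail_key):
    """Select the maximum of the two keys at each pixel."""
    return [max(o, d) for o, d in zip(outline_key, detail_key)]


def combine_weighted(outline_key, detail_key, outline_gain, detail_gain):
    """Pass each key through its own gain stage and sum the results."""
    return [clamp01(outline_gain * o + detail_gain * d)
            for o, d in zip(outline_key, detail_key)]


def confidence_weighted(outline_key, detail_key, edge_confidence):
    """Favor the detail key wherever edge confidence (in [0, 1]) is low."""
    return [clamp01(c * o + (1.0 - c) * d)
            for o, d, c in zip(outline_key, detail_key, edge_confidence)]
```

A weighting driven by ROI width could follow the same pattern, with a normalized width taking the place of `1.0 - c`.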

The balance of the processes indicated in Fig. 4 are concerned with converting 
the outline established at block 414 (Fig. 3) to other data forms which are useful in 
applications other than color correction. At block 427 a known technique is employed 
to form an approximation of the outline using Bezier curve spline segments. 

Optionally, as indicated at block 428, the splines created at block 427 may be
automatically removed in regions where the edge data exhibited low confidence. In
those regions, the operator would manually insert replacement splines to more
accurately follow the object boundary. At further optional steps 429 and 430,
respectively, temporal smoothing may be applied to the splines, and the operator is
permitted to manually adjust the splines. The temporal smoothing step includes
techniques such as temporal averaging or derivative rate limiting to ensure that the
splines move smoothly while tracking the object of interest in a dynamic image
stream. In block 430, the operator can use well known techniques employed in
computer animation and compositing such as moving spline control points and
manipulating control point "handles" to modify parameters of spline segments.
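
The two smoothing techniques named above can be sketched for a single control point coordinate tracked over successive images. The trailing-window average and the per-image step limit are assumed forms, and the names `temporal_average` and `rate_limit` are hypothetical.

```python
def temporal_average(values, window=3):
    """Trailing moving average of one control point coordinate over time."""
    smoothed = []
    for index in range(len(values)):
        recent = values[max(0, index - window + 1):index + 1]
        smoothed.append(sum(recent) / len(recent))
    return smoothed


def rate_limit(values, max_step):
    """Limit the change between consecutive images to at most max_step."""
    limited = [values[0]]
    for value in values[1:]:
        step = max(-max_step, min(max_step, value - limited[-1]))
        limited.append(limited[-1] + step)
    return limited
```

Applied to every coordinate of every control point, either function keeps a spline from jumping abruptly between images while it still follows the tracked object.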

The resulting spline data, as optionally modified at blocks 428-430, may be 
employed for applications such as compositing and computer animation. 

Block 431 relates to outputting useful geometric data which describe the
object of interest on the basis of the outline drawn at block 414. The geometric
parameters used at block 419 for motion estimation are of interest if the system is to
be used to synchronize external motion control equipment (not shown) with live or
recorded images or to provide motion tracking information for creating animated or
live action material which is to be matched with preexisting material. The geometric
parameter data describing the position and orientation of the boundary outlines can be
conveyed by standard interfacing techniques to the external equipment. Also, control
points can be specified and tracked within or along the boundary outline for use in
compositing applications.

Feature Discrimination Based on Adjacent Image Parameter Data Signatures 

In the segmentation method described herein, a detected feature map (edges in
a preferred embodiment) in a region of interest is derived from image parameter data
(luminance and chrominance in a preferred embodiment), and features that are not in
a central portion of the region of interest are suppressed by a bias function. In
addition to or instead of using a bias function to suppress features that are unlikely to
be of interest, it is contemplated to use a priori knowledge from a prior image or
images (or based on operator input) to suppress features in an image in a video clip
based on characteristics of adjoining pixels. Such a process may be implemented as
follows.

Once the object boundary outline has been located and adjusted in a first
image, adjacent image parameter data on the object side of the boundary is computed
for each point along the boundary. This computation develops a "signature" of the
adjacent image parameter data for the object of interest, and may be based on
luminance data (e.g., average luminance over some distance toward the interior of the
object), chrominance data, texture/spatial frequency content or some combination of
these characteristics. The resulting signature data can be expected, in many cases, to
differ from the corresponding characteristics of areas immediately outside of the
object. Instead of the simple average luminance signature referred to above, other
signatures are possible, including, for example, mean and variance of luminance and
chrominance values within a neighborhood of the vector normal to the boundary
outline. The contribution of each pixel within the neighborhood could be weighted
inversely to the pixel's distance from the outline.
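
A distance-weighted mean and variance signature can be sketched as follows. The weight 1 / (1 + distance) is an assumption chosen to avoid division by zero for a pixel on the outline itself, and the name `weighted_signature` is hypothetical.

```python
def weighted_signature(samples):
    """Weighted mean and variance of one image parameter near a boundary point.

    samples: list of (value, distance) pairs, where `value` is e.g. luminance
             and `distance` is the pixel's distance from the outline.
    """
    # Pixels nearer the outline contribute more (assumed weighting function).
    weights = [1.0 / (1.0 + distance) for _, distance in samples]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, samples)) / total
    variance = sum(w * (value - mean) ** 2
                   for w, (value, _) in zip(weights, samples)) / total
    return mean, variance
```

Running the same computation on chrominance samples, or on samples taken outside the boundary, gives the other signature variants the text mentions.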

Signatures could also be generated for the pixel data outside of the boundary,
or, respectively, both for inside and outside.

It will be recalled that the process for segmenting a video clip, as described
above, calls for relocating the boundary outline for the prior image based on a
measure of inter-image motion, and the region of interest is then re-centered on the
new outline position. Then a detected feature map is computed for the current image
in the re-centered region of interest. The signature data described in this section may
be employed to discriminate among the detected features, suppressing those which do
not have a similar signature.

Since the detected features may be distributed throughout the region of interest
and the signature information is only defined at the location of the boundary outline,
the signature information and normal vector directions must be diffused throughout
the region of interest to permit evaluation of all detected features. This is
accomplished by creating a "signature map" which consists of data arrays for the
region of interest, one for each signature parameter and one to encode the normal
vector direction. Points in these arrays which correspond to the boundary outline
location are initialized from the corresponding signature information and vector
direction of the pixels at the boundary outline. The signature map is then developed
by a morphological dilation operation. Each new pixel resulting from the dilation
inherits signature and vector direction information from the neighboring pixel or
pixels of the previous iteration of the signature map. If there is only one neighboring
pixel, then the same data values are carried over to the new pixel; if there is more than
one neighboring pixel, the data of the neighboring pixels may be averaged to generate
the data for the new pixel.
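
The dilation that spreads signature data through the region of interest can be sketched for one scalar signature parameter. The 8-connected neighborhood and the sparse dictionary representation are assumptions, and `dilate_signature_map` is a hypothetical name. Averaging a direction array would need angle-aware handling that this sketch does not attempt.

```python
def dilate_signature_map(seed, roi):
    """Spread boundary signature values over the whole region of interest.

    seed: dict mapping (row, col) on the boundary outline -> signature value.
    roi:  set of (row, col) positions making up the region of interest.
    Returns a dict with a value for every ROI position reachable from the seed.
    """
    current = dict(seed)
    while True:
        added = {}
        for row, col in roi:
            if (row, col) in current:
                continue
            # Only pixels set in the previous iteration act as neighbors.
            neighbors = [current[(row + dr, col + dc)]
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                         if (dr or dc) and (row + dr, col + dc) in current]
            if neighbors:
                # One neighbor: its value carries over. Several: average them.
                added[(row, col)] = sum(neighbors) / len(neighbors)
        if not added:
            return current
        current.update(added)
```

Each pass grows the map by one pixel in every direction, so a pixel midway between two boundary points ends up with the average of what reaches it from both sides.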

Once the signature map has been constructed, the detected features may be
evaluated in terms of signature data and those lacking appropriate signatures may be
suppressed. For each point in the ROI (or only those exhibiting at least a certain level
of the detected feature (e.g., edge)), a signature calculation is performed, using the
vector direction at the corresponding position in the signature map. The resulting
signature data is compared to the signature data at the corresponding position in the
signature map, and the feature is suppressed in inverse proportion to the degree of
matching between the calculated signature data for the point and the corresponding
signature map data. This would tend to suppress features found outside of the object.
As a possible additional feature, the signature may also be calculated in the direction
opposite to the normal direction (i.e. toward the "outside"), and the feature suppressed in
proportion to the degree of matching with the signature map data. This would have
the effect of suppressing features which have the same signature on both sides, which
are likely to be inside the object of interest.
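
The two suppression rules can be sketched for a single feature point and a single scalar signature. The linear `match_degree` measure and its `scale` parameter are assumptions made for illustration; the specification does not prescribe a particular matching function, and the function names are hypothetical.

```python
def match_degree(signature, reference, scale=32.0):
    """Degree of matching in [0, 1]; 1 means identical (assumed measure)."""
    return max(0.0, 1.0 - abs(signature - reference) / scale)


def discriminate(edge_strength, inside_signature, outside_signature,
                 map_signature, scale=32.0):
    """Attenuate a detected feature according to its signatures.

    inside_signature:  computed along the normal direction (toward the object).
    outside_signature: computed in the opposite direction.
    map_signature:     value at the same position in the signature map.
    """
    # Poor match on the object side: suppress in inverse proportion to match.
    keep = match_degree(inside_signature, map_signature, scale)
    # Good match on the outside too: suppress in proportion to that match,
    # since the feature probably lies in the interior of the object.
    keep *= 1.0 - match_degree(outside_signature, map_signature, scale)
    return edge_strength * keep
```

A true boundary point matches the object signature on one side only, so it is the one case that passes through unattenuated.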

The signature map may be based on a temporal average of prior images 
instead of just one prior image. The operator may also be permitted to selectively 
enable or disable feature discrimination based on signatures for objects which have
changing interior parameters such as varying illumination levels or changing colors. 
For such objects it may be preferable to discriminate features based only on the 
background signature. 

Other Embodiments 

Referring to Fig. 11, it will be noted that, at the right side of the screen
display, a rather large set of control options 332 is provided, permitting the operator to
select among many screen displays of either intermediate data generated by the image
segmentation device or final outputs such as the key signal which results from the
image segmentation process. However, in a preferred commercial embodiment of the
invention, it is contemplated to reduce the complexity of the user interface and to
improve operability by limiting the screen display selection options to "Outline Key"
(corresponding to Fig. 22), "Keyed Image" (Fig. 23), and "Outlined Image" (Fig. 19).

In addition to the novel, operator-guided automatic image segmentation
techniques disclosed herein, it is also contemplated to incorporate in the image
segmentation device 106 conventional features which allow the operator to perform
segmentation by drawing and animating simple geometric shapes. Segmentation by
these known techniques may be adequate when the object of interest itself has a rather
simple shape. The shapes that may be selected by the operator may include ellipses
and quadrilaterals and may be subjected to known geometric transform control
functions such as size, position, rotation, aspect and trapezoid.

The resulting key signals could also be subjected to the types of key 
modification processes referred to above. 

The animation of the selected (and possibly transformed) shapes can be
carried out with conventional key frame and path techniques, which need not be
described further. It is also contemplated to extend these conventional practices by
attaching one or more simple geometric shapes to the outline generated for tracking
purposes according to the inventive procedure described above. This permits a
geometric shape to be located relative to a tracked object boundary or portion thereof.

The invention has been described primarily in the context of a color correction
system of the type used for film-to-tape or tape-to-tape post-production work.
Specifically the invention has been described as a peripheral device to be connected to
a color corrector to provide key signals to the color corrector. Nevertheless, many
other applications of the teachings of the present invention are contemplated. The
software processes described herein could be advantageously applied to and embodied
in many other kinds of image manipulation equipment including video special effects
devices and devices used for colorizing black-and-white motion pictures. It is also
contemplated to include software embodying the present invention in commercially
distributed software packages used for desktop publishing or for manipulating clip art
images. The image segmentation capabilities of the invention are further applicable
to image compositing operations and manipulation of still images generally (both
color and black and white), including pre-press image processing. Another potential
application of the present image segmentation techniques is in computer-aided-design
and computer-aided-manufacturing software packages. The invention may also be
used to perform image segmentation as an input to 3-D simulation processes.
Software which embodies the present invention may be stored in various types of
digital memories such as RAM, hard disks, CD-ROM's and DVD's.

Previous discussion of Fig. 1 indicated that the key signals produced by the
process of Figs. 4 and 5 are outputted from image segmentation device 106 to color
corrector 104. However, a number of variations and alternatives are also
contemplated. For example, the functions of the image segmentation and color
correction blocks 106, 104 may be integrated in a single device. Moreover, the key
signals, other data produced from the processes described herein, and/or processed
images may be stored in the mass storage 116 (Fig. 2) of the image segmentation
device 106 and/or in a storage device which serves as the image source 102 (Fig. 1).

Further, if the color corrector does not have a capability for receiving an
external key input, an external keyer may be connected after the color corrector to
receive key signals from the image segmentation device, combining uncorrected
image data from the image source and corrected image data from the color corrector
on the basis of the key signal. In addition, the image segmentation device 106 may
itself be arranged to perform keying operations on the images from the image source
102 and image signals from the color corrector, based on key signals which the
segmentation device itself generates. Also, as has been stated above, the
segmentation device 106 may itself perform both the keying and color correction
operations.

Although particular illustrative embodiments of the present invention have 
been described herein with reference to the accompanying drawings, the present 
invention is not limited to these particular embodiments. Various changes and 
modifications may be made thereto by those skilled in the art without departing from
the spirit or scope of the invention, which is defined by the appended claims.

What is claimed is:

1. A method of dynamically segmenting an image plane on the basis of features 
of a dynamic sequence of images displayed in the image plane, the method 
comprising the steps of: 

displaying on a display device a first image of the sequence of images;

designating a region of interest in the image plane;

applying an image segmentation algorithm to said first image to generate a
first outline in said region of interest, the image segmentation algorithm being
constrained to operate only within said region of interest;

applying an algorithm to provide an indication of motion, between said first
image and a second image of said sequence of images, of an object corresponding to
said first outline;

repositioning said region of interest on the basis of the indicated motion of
said object; and

applying said image segmentation algorithm to said second image to generate
a second outline in said repositioned region of interest.

2. A method according to claim 1, wherein said indication of motion is a motion 
estimate projected from images generated prior in time to said second image. 

3. A method according to claim 1, wherein said indication of motion is based on
a measurement of motion between said first and second images.

4. A method according to claim 3, wherein said measurement of motion is based 
on a measurement of optical flow. 

5. A method according to claim 1, wherein said image segmentation algorithm 
includes an edge detection algorithm. 

6. A method according to claim 5, wherein said edge detection algorithm uses a
snake driven by a gradient vector flow (GVF) field.

7. A method according to claim 6, wherein said GVF field is generated by 
normalizing and then diffusing edge information generated by a Sobel edge detector. 

8. A method according to claim 1, further comprising the steps of: 

30 prior to the first applying step recited in claim 1, designating a first region 

bordered by said region of interest as an inside region and designating a second region 
bordered by said region of interest as an outside region; 
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for each pixel in said outline generated in said first applying step recited in 
claim 1, calculating an ROI width metric as the sum of (a) a distance from the 
respective pixel to a nearest pixel in the inside region and (b) a distance from the 
respective pixel to a nearest pixel in the outside region; and 
5 prior to said repositioning step, recentering said region of interest relative to 

said outline by using width metrics calculated for the pixels of the outline. 
9. A method of segmenting an image plane on the basis of features of an image 
displayed in the image plane, the method comprising the steps of: 

displaying the image on a display device; 

using a drawing device to superimpose a free-hand drawing figure on the 
image displayed on the display device, said free-hand drawing figure defining a 
band-shaped region of interest in the image plane formed as the locus of a circle 
moved in an arbitrary manner; 

applying an image analysis algorithm to the displayed image, said image 
analysis algorithm being constrained to operate only within said region of interest 
defined by said free-hand drawing figure, said image analysis algorithm operating 
without reference to any portion of said image outside of said region of interest; and 

segmenting the image plane on the basis of a result provided by application of 
said image analysis algorithm. 

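
By way of illustration only, a band-shaped region formed as the locus of a moving circle, and an analysis restricted to that region, might be sketched as follows in Python. The sampled path, the pixel-set representation and the names `band_region` and `strongest_in_region` are assumptions of this sketch; the "analysis" shown is a trivial stand-in chosen only to demonstrate that pixels outside the region are never read.

```python
import math


def band_region(path, radius, height, width):
    """Set of (y, x) pixels covered by a circle of `radius` centered on each path sample."""
    region = set()
    r = math.ceil(radius)
    for cy, cx in path:
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                    region.add((y, x))
    return region


def strongest_in_region(values, region):
    """Analyze only region pixels: return the (y, x) holding the largest value."""
    return max(region, key=lambda p: values[p[0]][p[1]])


# A short horizontal stroke drawn with a circle of radius 1 (hypothetical input).
roi = band_region([(5, 2), (5, 3), (5, 4)], 1, 10, 10)
```
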
10. A method according to claim 9, wherein said image analysis algorithm 
includes an edge detection algorithm which produces edge information and further 
comprising the step of de-emphasizing components of the edge information that are 
not located at a central portion of said region of interest. 
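
By way of illustration only, one possible de-emphasis of off-center edge information is sketched below in Python. The particular weighting, which equals 1 midway between the borders of the region of interest and falls linearly toward 0 at either border, is an assumption of this sketch, as are the function names and the use of per-pixel distances to the inside and outside regions as inputs.

```python
def center_weight(d_inside, d_outside):
    """1.0 midway between the ROI borders, falling linearly toward 0 at either border."""
    total = d_inside + d_outside
    if total == 0:
        return 0.0
    return 2.0 * min(d_inside, d_outside) / total


def bias_edges(edge_strengths, d_inside, d_outside):
    """Scale each edge strength by the center weight of its pixel."""
    return [e * center_weight(di, do)
            for e, di, do in zip(edge_strengths, d_inside, d_outside)]


# Three equally strong edge pixels across a band: near inside, central, near outside.
biased = bias_edges([1.0, 1.0, 1.0], [1, 3, 5], [5, 3, 1])
```

Under these assumptions the central pixel keeps its full strength while the two pixels near the borders are reduced to one third.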

11. A method according to claim 9, wherein said image analysis algorithm 
includes: 

applying edge detection processing to luminance information in said region of 
interest to produce luma edge information; 

applying edge detection processing to color information in said region of 
interest to produce color edge information; and 

combining the luma edge information and the color edge information to 
produce combined edge information. 

12. A method according to claim 9, wherein said segmenting step comprises: 

computing an external force field within said region of interest; 

initializing a snake within said region of interest; 

iteratively repositioning the snake within said region of interest on the basis of 
the external force field until a convergence test is satisfied; and 

segmenting the image plane on the basis of a repositioned snake which 
satisfies the convergence test. 

13. A method according to claim 9, further comprising the step of designating a 
region bordered by said drawing figure as an inside region. 

14. A method according to claim 13, further comprising the step of using the 
drawing device to erase a portion of said drawing figure, an area corresponding to 
said erased portion being added to the inside region. 

15. A method according to claim 9, further comprising the step, performed after 
said drawing figure has been superimposed on said displayed image, of using the 
drawing device to increase the width of said region of interest at a selected portion of 
said region of interest. 

16. A method according to claim 9, wherein said drawing figure is indicated on 
the display device by altering a luminance level in the locus of the drawing figure. 

17. A method according to claim 9, wherein said drawing figure is indicated on 
the display device by altering a magnitude of at least one color component in the 
locus of the drawing figure. 

18. A method according to claim 9, further comprising the steps of: 

designating a first region bordered by said drawing figure as an inside region 
and designating a second region bordered by said drawing figure as an outside region; 
and 

calculating a distance metric for each pixel in the region of interest, the 
distance metric for the respective pixel corresponding to the ratio of (a) a distance 
between the respective pixel and a nearest pixel in the inside region, relative to (b) a 
distance between the respective pixel and a nearest pixel in the outside region. 
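
By way of illustration only, the distance metric might be computed as in the following Python sketch. Representing the inside and outside regions as lists of (y, x) coordinates, using Euclidean distance, and the name `distance_ratio` are assumptions of this sketch.

```python
import math


def distance_ratio(pixel, inside_pixels, outside_pixels):
    """Distance to the nearest inside pixel divided by distance to the nearest outside pixel."""
    d_inside = min(math.dist(pixel, p) for p in inside_pixels)
    d_outside = min(math.dist(pixel, p) for p in outside_pixels)
    return d_inside / d_outside


# One-row example: inside pixels at x = 0..1, band at x = 2..5, outside pixels at x = 6..7.
inside = [(0, 0), (0, 1)]
outside = [(0, 6), (0, 7)]
ratios = [distance_ratio((0, x), inside, outside) for x in range(2, 6)]
```

Under these assumptions the metric is below 1 for pixels nearer the inside region, above 1 for pixels nearer the outside region, and equal to 1 where the two distances match.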

19. A method according to claim 18, further comprising the steps of: 

generating a key map on the basis of a result provided by application of said 
image analysis algorithm; 

generating a vector field in said region of interest on the basis of a slope 
function derived from the distance metrics for the pixels of said region of interest; 

receiving a command to resize said key map; and 

resizing said key map on the basis of said vector field. 

20. A method of applying an edge detection algorithm to color information 
corresponding to pixels arrayed in an image plane, the method comprising the step of: 

calculating, for each of said pixels, a distance value in color space between a 
pair of neighboring pixels, said color space being defined by plural color axes. 
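
By way of illustration only, a color-space distance value might be computed as in the following Python sketch. The sketch assumes that the pair of neighboring pixels is the left and right neighbor of each pixel, that the distance is Euclidean, and that each pixel is a tuple of color components; the claim fixes none of these, and the name `color_edge` is hypothetical.

```python
import math


def color_edge(image):
    """Per pixel, the color-space distance between its left and right neighbors.

    Neighbor indices are clamped at the row ends, so border pixels compare
    themselves with their single available neighbor.
    """
    result = []
    for row in image:
        last = len(row) - 1
        result.append([
            math.dist(row[max(x - 1, 0)], row[min(x + 1, last)])
            for x in range(len(row))
        ])
    return result


# One row with a color step between the second and third pixels (hypothetical values).
row_edges = color_edge([[(0, 0, 0), (0, 0, 0), (3, 4, 0), (3, 4, 0)]])
```

Because the distance is taken over all color axes at once, a boundary between two colors of equal luminance still produces a response.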

21. A method of providing a softness function with respect to a key signal, the 
method comprising the steps of: 

generating a key boundary by means of an edge detection algorithm, said 
algorithm generating for each pixel on said key boundary edge-degree data indicative 
of a degree of definiteness of an edge at the respective pixel; and 

adjusting a softness function along said key boundary in dependence on said 
edge-degree data. 

22. A method according to claim 21, wherein said softness function is adjusted 
such that the softness function varies inversely with said edge-degree data. 
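
By way of illustration only, a softness function that varies inversely with the edge-degree data might be sketched as follows in Python. The linear mapping, the assumption that edge-degree data lies in the range 0 to 1, and the names `softness_widths`, `max_width` and `min_width` are assumptions of this sketch.

```python
def softness_widths(edge_degrees, max_width=8.0, min_width=1.0):
    """Map each boundary pixel's edge degree to a softness width.

    A definite edge (degree near 1) gets a narrow transition; an indefinite
    edge (degree near 0) gets a wide one. Degrees are clamped to [0, 1].
    """
    widths = []
    for degree in edge_degrees:
        clamped = min(1.0, max(0.0, degree))
        widths.append(max_width - clamped * (max_width - min_width))
    return widths


# Three boundary pixels: no edge evidence, moderate evidence, a definite edge.
widths = softness_widths([0.0, 0.5, 1.0])
```
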

23. A method of adjusting a softness function with respect to a key signal, the 
method comprising the steps of: 

generating a key boundary; 

designating a first region bordered by the key boundary as an inside region 
and designating a second region bordered by the key boundary as an outside region; 

applying a softness function to the key boundary to generate a gradient in a 
key signal between said inside region and said outside region; and 

adjusting said softness function in response to a control signal input by a 
human operator so that a slope of said gradient is increased on a side adjacent said 
outside region without changing a slope of said gradient on a side adjacent said inside 
region. 

24. A method of generating an edge information field, the method comprising the 
steps of: 

applying an edge detector algorithm to pixel information arrayed in a region of 
interest in an image plane, said edge detector algorithm generating edge information 
from said pixel information; and 

applying a bias function to said edge information to emphasize components of 
said edge information at a central portion of said region of interest, thereby producing 
biased edge information. 

25. A method according to claim 24, further comprising the step of computing an 
edge gradient vector field from the biased edge information. 

26. A method of performing an edge detection algorithm which employs a snake 
driven by force field data, the method comprising the steps of: 

iteratively repositioning the snake in response to the force field data; and 

after each repositioning of the snake, applying a convergence test which 
includes: 

determining for each control point in the snake an amount by which 
said control point was moved in the respective repositioning in a direction 
normal to the length of the snake at said control point; 

averaging the determined amounts; and 

comparing the averaged amounts to a threshold. 
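
By way of illustration only, the convergence test might be sketched as follows in Python. The sketch assumes a closed snake given as a list of (x, y) control points, estimates the tangent at each point from its two neighbors, and treats the snake as converged when the average falls below the threshold; these details, the function names and the default threshold are assumptions and are not taken from the claim.

```python
import math


def mean_normal_displacement(before, after):
    """Average over control points of the movement component normal to the snake."""
    n = len(before)
    total = 0.0
    for i in range(n):
        # Tangent estimated from the two neighboring control points (closed snake).
        prev_x, prev_y = before[i - 1]
        next_x, next_y = before[(i + 1) % n]
        tx, ty = next_x - prev_x, next_y - prev_y
        length = math.hypot(tx, ty)
        if length == 0:
            continue  # degenerate neighbors: no usable normal at this point
        normal_x, normal_y = -ty / length, tx / length
        dx = after[i][0] - before[i][0]
        dy = after[i][1] - before[i][1]
        total += abs(dx * normal_x + dy * normal_y)
    return total / n


def has_converged(before, after, threshold=0.1):
    """Compare the averaged normal movement to a threshold (assumed: below means converged)."""
    return mean_normal_displacement(before, after) < threshold
```

Measuring only the normal component means that control points sliding along the contour do not delay convergence, since such sliding does not change the shape of the outline.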

27. An image signal processing apparatus, comprising: 

a memory for storing image information which represents a dynamic sequence 
of images; 

a display device for displaying in an image plane at least a first image of the 
dynamic sequence of images; 

a processor connected to the memory; and 

a drawing device connected to the processor; 

wherein the processor is programmed to: 

cause the display device to display said first image of the dynamic 
sequence of images; 

receive signals generated by the drawing device; 

on the basis of the received signals, superimpose a drawing figure on 
the first image displayed by the display device, said drawing figure defining a 
region of interest in the image plane; 

apply an image segmentation algorithm to said first image to generate 
a first outline in said region of interest, the image segmentation algorithm 
being constrained to operate only within said region of interest; 

apply an algorithm to provide an indication of motion, between said 
first image and a second image of said sequence of images, of an object 
corresponding to said first outline; 

reposition the region of interest on the basis of the indicated motion of 
the object; and 

apply the image segmentation algorithm to the second image to 
generate a second outline in the repositioned region of interest. 

28. An apparatus according to claim 27, wherein said drawing figure defining said 
region of interest is a free-hand drawing figure which defines a band-shaped region 
formed as the locus of a circle moved in an arbitrary manner. 

29. An apparatus according to claim 27, wherein said image segmentation 
algorithm includes an edge detection algorithm. 

30. An apparatus according to claim 27, wherein said drawing device includes at 
least one of a mouse and a tablet/stylus arrangement. 

31. An image signal processing apparatus, comprising: 

a memory for storing image information which represents an image; 

a display device for displaying in an image plane the image represented by the 
image information; 

a processor connected to the memory; and 

a drawing device connected to the processor; 

wherein the processor is programmed to: 

cause the display device to display the image represented by the image 
information; 

receive signals generated by the drawing device; 

on the basis of the received signals, render a free-hand drawing figure on the 
display device, said drawing figure defining a band-shaped region formed as the locus 
of a circle moved in an arbitrary manner; 

apply an image analysis algorithm to the image information, said image 
analysis algorithm being constrained to operate only within the region of interest 
defined by the free-hand drawing figure, said image analysis algorithm operating 
without reference to any portion of said image outside said region of interest; and 

segment the image plane on the basis of a result of said application of said 
image analysis algorithm. 

32. An apparatus according to claim 31, wherein said processor segments the 
image plane by: 

computing a gradient vector flow field from edge information within said 
region of interest; 

initializing a snake within said region of interest; 

iteratively repositioning the snake within said region of interest until a 
convergence test is satisfied; and 

segmenting the image plane on the basis of a repositioned snake which 
satisfies the convergence test. 

33. An apparatus according to claim 31, further comprising: 

a storage device connected to the memory for providing image information to 
be stored in the memory; and 

a color correction device connected to the processor; 

wherein the processor segments the image plane to generate a key signal and 
transmits the key signal to the color correction device. 

34. An apparatus according to claim 31, wherein said processor is programmed to 
permit a user of the apparatus to designate, by means of said drawing device, a first 
region bordered by said region of interest as an inside region and a second region 
bordered by said region of interest as an outside region. 

35. An apparatus according to claim 31, wherein the image analysis algorithm 
includes an edge detection function. 

36. A digital memory which stores a program for instructing a processor to 
segment an image plane on the basis of features of a dynamic sequence of images 
displayed in the image plane, the program including instructions for: 

causing a first image of the sequence of images to be displayed on a display 
device; 

designating a region of interest in the image plane; 

applying an image segmentation algorithm to the displayed image, to generate 
a first outline in the region of interest, the image segmentation algorithm being 
constrained to operate only within said region of interest; 

applying an algorithm to provide an indication of motion, between said first 
image and a second image of said sequence of images, of an object corresponding to 
said first outline; 

repositioning said region of interest on the basis of the indicated motion of 
said object; and 

applying said image segmentation algorithm to said second image to generate 
a second outline in said repositioned region of interest. 

37. A digital memory which stores a program for instructing a processor to 
segment an image plane on the basis of features of an image displayed in the image 
plane, the program including instructions for: 

causing the image to be displayed on a display device; 

receiving signals from a drawing device and, on the basis of the received 
signals, superimposing a free-hand drawing figure on the image displayed on the 
display device, said free-hand drawing figure defining a band-shaped region of 
interest formed as the locus of a circle moved in an arbitrary manner; 

applying an image analysis algorithm to the displayed image, said image 
analysis algorithm being constrained to operate only within said region of interest 
defined by said free-hand drawing figure, said image analysis algorithm operating 
without reference to any portion of said image outside of said region of interest; and 

segmenting the image plane on the basis of a result provided by application of 
said image analysis algorithm. 

38. A method of dynamically segmenting an image plane on the basis of features 
of a dynamic sequence of images displayed in the image plane, the method 
comprising the steps of: 

applying an image segmentation algorithm to a first image of the 
sequence of images to generate an outline in said image plane; 

detecting at least one characteristic of said first image in at least one 
area adjacent to said outline to generate signature map data; 

applying an algorithm to provide an indication of motion, between said 
first image and a second image of said sequence of images, of an object 
corresponding to said outline; 

repositioning said outline in said image plane on the basis of the 
indicated motion of said object; 

detecting features of said second image in a region of interest defined 
around said repositioned outline; 

detecting said at least one characteristic of said second image, in at 
least one area adjacent to said detected features, to generate signature data for said 
second image; 

comparing said signature data for said second image with said 
signature map data; and 

selectively suppressing components of said detected features on the 
basis of a result of said comparing step. 

39. A method according to claim 38, wherein said step of detecting features of 
said second image includes detecting edges in said second image. 

40. A method according to claim 38, wherein said signature map data includes 
direction data indicative of a direction normal to said outline generated by said image 
segmentation algorithm; and said direction data is used in said step of detecting said at 
least one characteristic of said second image. 